Abstract

In this second part of our two-part paper, we invoke the stochastic maximum principle, conditional 
Hamiltonian and the coupled backward-forward stochastic differential equations of the first part [1] to
derive team optimal decentralized strategies for distributed stochastic differential systems with noiseless 
information structures. We present examples of such team games of nonlinear as well as linear quadratic 
forms. In some cases we obtain closed form expressions of the optimal decentralized strategies. 

Through the examples, we illustrate the effect of information signaling among the decision makers 
in reducing the computational complexity of optimal decentralized decision strategies. 
I. Introduction

In the first part [1] of this two-part paper, we have derived team and person-by-person
optimality conditions for distributed stochastic differential systems with noiseless decentralized 
information structures. Specifically, we considered distributed (coupled) stochastic differential 
equations of Ito form driven by Brownian motions, and decision makers acting on decentralized 
noiseless i) nonanticipative and ii) feedback information structures, and we have shown existence 
of team and person-by-person optimal strategies utilizing relaxed and regular strategies. Then 
we applied tools from the classical theory of stochastic optimization with some variations to 
derive team and person-by-person optimality conditions [2]-[5].

The first important conclusion drawn from [1] is that the classical theory of stochastic optimization
is not limited in mathematical concepts and procedures by the centralized assumption
based upon which it is developed. It is directly applicable to differential systems consisting
of multiple decision makers, in which the acquisition of information and its processing is
decentralized or shared among several locations, while the decision makers' actions are based on
different information structures. The second important conclusion drawn from [1] is that team
and person-by-person optimality conditions are given by a Hamiltonian system of equations 
consisting of a conditional Hamiltonian, and coupled forward-backward stochastic differential 
equations. 

The work in [1] complements the current body of knowledge on static team game theory
[6]-[10], and decentralized decision making [9]-[20], and more recent work in [21]-[26], by
introducing optimality conditions for general stochastic nonlinear differential systems.

The main remaining challenge is to determine whether under the formulation and assumptions
introduced in [1], we can derive optimal decentralized strategies for nonlinear and linear
distributed stochastic differential systems, understand the computational complexity of these 
strategies compared to centralized strategies, and determine how this complexity can be reduced 
by allowing limited signaling among the different decision makers. 

Therefore, in this second part of the two-part investigation, we apply the optimality conditions 
derived in the first part to a variety of linear and nonlinear distributed stochastic differential 
systems with decentralized noiseless information structures to derive optimal strategies. Our 
investigation leads to the following conclusions. 

1) When the dynamics are linear in the decision variables and nonlinear in the state variables, 
and the pay-off is quadratic in the decision variable and nonlinear in the state variable, the
optimal decentralized strategies are given in terms of conditional expectations with respect
to the information structure on which they act;

2) When the dynamics are linear in the state and the decision variables, and the pay-off is 
quadratic in the state and the decision variables, then the optimal decentralized strategies 
are computed in closed form, much as in the classical Linear-Quadratic Theory. However,
when the pay-off includes coupling between the decision makers, the optimal strategy of
any player is also a function of the average value of the optimal strategies of the other 
players. 

3) The computation of the optimal strategies involves the solution of certain equations, which 
can be formulated and solved via fixed point methods. 

4) The computational complexity of the optimal decentralized strategies can be reduced by
signaling specific information among the decision makers and/or by considering certain 
structure for the distributed system and pay-off. 

The rest of the paper is organized as follows. In Section II we introduce the distributed
stochastic system with decentralized information structures and the main assumption, and we
state the optimality conditions derived in [1]. In Section III we apply the optimality conditions
to several forms of team games, and we show how the optimal decentralized strategies are 
computed. For the case of linear differential dynamics and quadratic pay-off we obtain explicit 
expressions of the optimal decentralized team strategies. The paper is concluded with some 
comments on possible extensions of our results. 



II. Team and Person-by-Person Optimality Conditions

In this section we introduce the mathematical formulation of distributed stochastic systems with
decentralized noiseless information structures, and the optimality conditions derived in [1].

The formulation in [1] presupposes a fixed probability space with filtration, $(\Omega, \mathbb{F}, \{\mathbb{F}_{0,t} : t \in [0,T]\}, \mathbb{P})$, satisfying the usual conditions, that is, $(\Omega, \mathbb{F}, \mathbb{P})$ is complete and $\mathbb{F}_{0,0}$ contains all $\mathbb{P}$-null sets in $\mathbb{F}$. All $\sigma$-algebras are assumed complete and right continuous, that is, $\mathbb{F}_{0,t} = \mathbb{F}_{0,t+} = \bigcap_{s>t} \mathbb{F}_{0,s}$, $\forall t \in [0,T)$. We use the notation $\mathbb{F}_T = \{\mathbb{F}_{0,t} : t \in [0,T]\}$ and similarly for the rest of the filtrations.

The minimum principle in [1] is derived utilizing the following spaces. Let $L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n) \subset L^2(\Omega \times [0,T], d\mathbb{P} \times dt, \mathbb{R}^n) \equiv L^2([0,T], L^2(\Omega,\mathbb{R}^n))$ denote the space of $\mathbb{F}_T$-adapted random processes $\{z(t) : t \in [0,T]\}$ such that

$$\mathbb{E} \int_{[0,T]} |z(t)|^2_{\mathbb{R}^n}\, dt < \infty,$$

which is a sub-Hilbert space of $L^2([0,T], L^2(\Omega,\mathbb{R}^n))$. Similarly, let $L^2_{\mathbb{F}_T}([0,T], \mathcal{L}(\mathbb{R}^m,\mathbb{R}^n)) \subset L^2([0,T], L^2(\Omega, \mathcal{L}(\mathbb{R}^m,\mathbb{R}^n)))$ denote the space of $\mathbb{F}_T$-adapted $n \times m$ matrix valued random processes $\{\Sigma(t) : t \in [0,T]\}$ such that

$$\mathbb{E} \int_{[0,T]} |\Sigma(t)|^2_{\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n)}\, dt = \mathbb{E} \int_{[0,T]} \mathrm{tr}\big(\Sigma^*(t)\Sigma(t)\big)\, dt < \infty.$$


A. Distributed Stochastic Differential Decision Systems 

A stochastic differential decision or control system is called distributed if it consists of an 
interconnection of at least two subsystems and decision makers, whose actions are based on 
decentralized information structures. The underlying assumption is that the decision makers are 
allowed to exchange information on their law or strategy deployed, but not their actions. 



Let $(\Omega, \mathbb{F}, \{\mathbb{F}_{0,t} : t \in [0,T]\}, \mathbb{P})$ denote a fixed complete filtered probability space on which we
shall define all processes. At this stage we do not specify how $\{\mathbb{F}_{0,t} : t \in [0,T]\}$ came about,
but we require that Brownian motions are adapted to this filtration.

Admissible Decision Maker Strategies 

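The Decision Makers (DM) $\{u^i : i \in \mathbb{Z}_N\}$ take values in a closed convex subset of metric spaces $\{(\mathbb{M}^i, d) : i \in \mathbb{Z}_N\}$. Let $\mathcal{G}^i_T = \{\mathcal{G}^i_{0,t} : t \in [0,T]\} \subset \{\mathbb{F}_{0,t} : t \in [0,T]\}$ denote the information available to DM $i$, $\forall i \in \mathbb{Z}_N$. The admissible set of regular strategies is defined by

$$\mathbb{U}^i_{reg}[0,T] = \Big\{ u^i \in L^2_{\mathcal{G}^i_T}([0,T],\mathbb{R}^{d_i}) : u^i_t \in \mathbb{A}^i \subset \mathbb{R}^{d_i},\ \text{a.e. } t \in [0,T],\ \mathbb{P}\text{-a.s.} \Big\}, \quad \forall i \in \mathbb{Z}_N. \tag{1}$$

Clearly, $\mathbb{U}^i_{reg}[0,T]$ is a closed convex subset of $L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n)$, for $i = 1,2,\ldots,N$, and $u^i : [0,T] \times \Omega \to \mathbb{A}^i$, $\{u^i_t : t \in [0,T]\}$ is $\mathcal{G}^i_T$-adapted, $\forall i \in \mathbb{Z}_N$.

An $N$ tuple of DM strategies is by definition $(u^1, u^2, \ldots, u^N) \in \mathbb{U}^{(N)}_{reg}[0,T] = \times_{i=1}^N \mathbb{U}^i_{reg}[0,T]$.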
Distributed Stochastic Systems 

On the probability space $(\Omega, \mathbb{F}, \{\mathbb{F}_{0,t} : t \in [0,T]\}, \mathbb{P})$ the distributed stochastic system consists
of an interconnection of $N$ subsystems, and each subsystem $i$ has state space $\mathbb{R}^{n_i}$, action space
$\mathbb{A}^i \subset \mathbb{R}^{d_i}$, an exogenous noise space $\mathbb{R}^{m_i}$, and an initial state $x^i(0) = x^i_0$, identified by the
following quantities.

(S1) $x^i(0) = x^i_0$: an $\mathbb{R}^{n_i}$-valued Random Variable;

(S2) $\{W^i(t) : t \in [0,T]\}$: an $\mathbb{R}^{m_i}$-valued standard Brownian motion which models the
exogenous state noise, adapted to $\mathbb{F}_T$, independent of $x^i(0)$;

Each subsystem is described by coupled stochastic differential equations of Ito type as follows. 

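$$dx^i(t) = f^i(t,x^i(t),u^i_t)dt + \sigma^i(t,x^i(t),u^i_t)dW^i(t) + \sum_{j=1, j \neq i}^N f^{ij}(t,x^j(t),u^j_t)dt + \sum_{j=1, j \neq i}^N \sigma^{ij}(t,x^j(t),u^j_t)dW^j(t), \quad x^i(0) = x^i_0, \quad t \in (0,T], \quad i \in \mathbb{Z}_N. \tag{2}$$

Define the augmented vectors by

$$W = (W^1, W^2, \ldots, W^N) \in \mathbb{R}^m, \quad u = (u^1, u^2, \ldots, u^N) \in \mathbb{R}^d, \quad x = (x^1, x^2, \ldots, x^N) \in \mathbb{R}^n.$$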

The distributed system is described in compact form by 

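$$dx(t) = f(t,x(t),u_t)dt + \sigma(t,x(t),u_t)\,dW(t), \quad x(0) = x_0, \quad t \in (0,T], \tag{3}$$

where $f : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to \mathbb{R}^n$ denotes the drift and $\sigma : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to \mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)$
the diffusion coefficients.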

Pay-off Functional 

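Given a $u \in \mathbb{U}^{(N)}_{reg}[0,T]$ and (3) we define the reward or performance criterion by

$$J(u) = J(u^1, u^2, \ldots, u^N) = \mathbb{E}\Big\{ \int_0^T \ell(t,x(t),u_t)\,dt + \varphi(x(T)) \Big\}, \tag{4}$$

where $\ell : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to (-\infty,\infty]$ denotes the running cost function and $\varphi : \mathbb{R}^n \to
(-\infty,\infty]$, the terminal cost function.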
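
As a hedged illustration of the compact system (3) and the pay-off (4), the following Python sketch simulates a hypothetical two-subsystem example by the Euler-Maruyama scheme and estimates the pay-off by Monte Carlo. The linear drift, the coupling gain 0.5, the noise level, the quadratic costs and the names `simulate` and `payoff` are all illustrative assumptions and are not taken from the paper. Each decision maker is given access only to its own subsystem state, which is one simple instance of a decentralized feedback information structure.

```python
import math
import random


def simulate(u1, u2, x0=(1.0, -1.0), T=1.0, steps=200, noise=0.3, seed=0):
    """One Euler-Maruyama path of a hypothetical two-subsystem system.

    dx1 = (-x1 + u1 + 0.5 * x2) dt + noise dW1
    dx2 = (-x2 + u2 + 0.5 * x1) dt + noise dW2
    Returns the terminal state and the sample pay-off along the path.
    """
    rng = random.Random(seed)
    dt = T / steps
    x1, x2 = x0
    cost = 0.0
    for n in range(steps):
        t = n * dt
        # Each decision maker observes only its own subsystem state.
        a1, a2 = u1(t, x1), u2(t, x2)
        # Running cost (assumed quadratic in state and decision).
        cost += 0.5 * (x1 * x1 + x2 * x2 + a1 * a1 + a2 * a2) * dt
        dw1 = rng.gauss(0.0, math.sqrt(dt))
        dw2 = rng.gauss(0.0, math.sqrt(dt))
        x1, x2 = (
            x1 + (-x1 + a1 + 0.5 * x2) * dt + noise * dw1,
            x2 + (-x2 + a2 + 0.5 * x1) * dt + noise * dw2,
        )
    # Terminal cost (assumed quadratic).
    cost += 0.5 * (x1 * x1 + x2 * x2)
    return (x1, x2), cost


def payoff(u1, u2, samples=500, **kwargs):
    """Monte Carlo estimate of the pay-off: mean sample cost over independent paths."""
    total = 0.0
    for k in range(samples):
        _, cost = simulate(u1, u2, seed=k, **kwargs)
        total += cost
    return total / samples


if __name__ == "__main__":
    zero = lambda t, x: 0.0          # no action
    damp = lambda t, x: -0.4 * x     # hypothetical linear feedback on own state
    print("J(zero) ~", payoff(zero, zero))
    print("J(damp) ~", payoff(damp, damp))
```

Comparing the two printed estimates gives a rough feel for how a decentralized strategy changes the pay-off; the sketch makes no claim that either strategy is optimal.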

B. Team and Person-by-Person Optimality

In this section we give the precise definitions of team and person-by-person optimality for 
regular strategies. 
We consider the following information structures.

(NIS): Nonanticipative Information Structures. $u^i$ is adapted to the filtration $\mathcal{G}^i_T \subset \mathbb{F}_T$ generated
by the $\sigma$-algebra of nonlinear nonanticipative measurable functionals of any combination
of the subsystems' Brownian motions $\{(W^1(t), W^2(t), \ldots, W^N(t)) : t \in [0,T]\}$, $\forall i \in \mathbb{Z}_N$.
This is often called open loop information, and it is the one used in classical stochastic control
with centralized full information to derive the maximum principle [27].

(FIS): Feedback Information Structures. $u^i$ is adapted to the filtration $\mathcal{G}^{z^i,u}_T$ generated by
the $\sigma$-algebra $\mathcal{G}^{z^i,u}_{0,t} = \sigma\{z^i(s) : 0 \leq s \leq t\}$, $t \in [0,T]$, where the observables $z^i$ are nonlinear
nonanticipative measurable functionals of any combination of the states defined by

$$z^i(t) = h^i(t,x), \quad h^i : [0,T] \times C([0,T],\mathbb{R}^n) \to \mathbb{R}^{k_i}, \quad i \in \mathbb{Z}_N. \tag{5}$$

Note that the index $u$ emphasizes the fact that feedback strategies depend on $u$.
The set of admissible regular feedback strategies is defined by

$$\mathbb{U}^{(N),z^u}_{reg}[0,T] = \Big\{ u \in \mathbb{U}^{(N)}_{reg}[0,T] : u^i_t \text{ is } \mathcal{G}^{z^i,u}_{0,t}\text{-measurable},\ t \in [0,T],\ i = 1,\ldots,N \Big\}. \tag{6}$$

Problem 1. (Team Optimality) Given the pay-off functional (4), constraint (3), the $N$ tuple of
strategies $u^o = (u^{1,o}, u^{2,o}, \ldots, u^{N,o}) \in \mathbb{U}^{(N)}_{reg}[0,T]$ is called nonanticipative team optimal if it
satisfies

$$J(u^{1,o}, u^{2,o}, \ldots, u^{N,o}) \leq J(u^1, u^2, \ldots, u^N), \quad \forall u = (u^1, u^2, \ldots, u^N) \in \mathbb{U}^{(N)}_{reg}[0,T]. \tag{7}$$

Any $u^o \in \mathbb{U}^{(N)}_{reg}[0,T]$ satisfying (7) is called an optimal decision strategy (or control) and the
corresponding $x^o(\cdot) \equiv x(\cdot; u^o(\cdot))$ (satisfying (3)) is called an optimal state process. Similarly,
feedback team optimal strategies are defined with respect to $u^o \in \mathbb{U}^{(N),z^u}_{reg}[0,T]$.

An alternative approach to handle such problems with decentralized information structures is 
to restrict the definition of optimality to the so-called person-by-person equilibrium. 
Define 

$$\tilde{J}(v, u^{-i}) = J(u^1, u^2, \ldots, u^{i-1}, v, u^{i+1}, \ldots, u^N).$$

Problem 2. (Person-by-Person Optimality) Given the pay-off functional (4), constraint (3), the
$N$ tuple of strategies $u^o = (u^{1,o}, u^{2,o}, \ldots, u^{N,o}) \in \mathbb{U}^{(N)}_{reg}[0,T]$ is called nonanticipative person-by-person
optimal if it satisfies

$$\tilde{J}(u^{i,o}, u^{-i,o}) \leq \tilde{J}(u^i, u^{-i,o}), \quad \forall u^i \in \mathbb{U}^i_{reg}[0,T], \quad \forall i \in \mathbb{Z}_N. \tag{8}$$

Similarly, feedback person-by-person optimal strategies are defined with respect to $u^o \in \mathbb{U}^{(N),z^u}_{reg}[0,T]$.

Conditions (8) are analogous to the Nash equilibrium strategies of team games consisting of a
single pay-off and N DM. The person-by-person optimal strategy states that none of the N DM 
with different information structures can deviate unilaterally from the optimal strategy and gain 
by doing so. 
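
The difference between the two notions can be made concrete on a static toy problem. The sketch below is a minimal illustration, assuming a hypothetical two-decision-maker quadratic pay-off with made-up data `R` and `eta` and no dynamics or information constraints; the function names are illustrative as well. It checks numerically that, at the team optimum, no single decision maker can lower the pay-off by deviating alone.

```python
R_DEFAULT = ((2.0, 0.5), (0.5, 1.0))   # hypothetical symmetric positive definite weight
ETA_DEFAULT = (1.0, -1.0)              # hypothetical linear term


def J(u1, u2, R=R_DEFAULT, eta=ETA_DEFAULT):
    """Static quadratic team pay-off 0.5 * <u, R u> + <u, eta>."""
    quad = R[0][0] * u1 * u1 + 2.0 * R[0][1] * u1 * u2 + R[1][1] * u2 * u2
    return 0.5 * quad + eta[0] * u1 + eta[1] * u2


def team_optimum(R=R_DEFAULT, eta=ETA_DEFAULT):
    """Joint minimizer of J: solves R u = -eta by Cramer's rule."""
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    u1 = (-eta[0] * R[1][1] + eta[1] * R[0][1]) / det
    u2 = (-eta[1] * R[0][0] + eta[0] * R[1][0]) / det
    return u1, u2


def is_person_by_person_optimal(u1, u2, deviations, tol=1e-12):
    """True if no unilateral deviation from (u1, u2) lowers the pay-off."""
    base = J(u1, u2)
    return all(
        J(u1 + d, u2) >= base - tol and J(u1, u2 + d) >= base - tol
        for d in deviations
    )


if __name__ == "__main__":
    u_star = team_optimum()
    print(u_star, is_person_by_person_optimal(*u_star, [-1.0, -0.1, 0.1, 1.0]))
```

For this convex toy pay-off the team optimum passes the unilateral-deviation check, while an arbitrary point such as the origin does not.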

C. Team and Person-by-Person Optimality Conditions 

In this section we first introduce the assumptions on $\{f, \sigma, h, \ell, \varphi\}$ and then we state the
optimality conditions derived in [1].

Let $B^\infty_{\mathbb{F}_T}([0,T], L^2(\Omega,\mathbb{R}^n))$ denote the space of $\mathbb{F}_T$-adapted $\mathbb{R}^n$ valued second order random
processes endowed with the norm topology $\| \cdot \|$ defined by

$$\| x \|^2 = \sup_{t \in [0,T]} \mathbb{E}|x(t)|^2_{\mathbb{R}^n}.$$

The main assumptions are stated below. 
Assumptions 1. (Main assumptions) 

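$\mathbb{A}^i$ is a closed and convex subset of $\mathbb{R}^{d_i}$, $\forall i \in \mathbb{Z}_N$, $\mathbb{E}|x(0)|_{\mathbb{R}^n} < \infty$ and the maps $\{f, \sigma, \ell, \varphi\}$
satisfy the following conditions.

(A1) $f : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to \mathbb{R}^n$ is continuous in $(t,x,u)$ and continuously differentiable
with respect to $x, u$;

(A2) $\sigma : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to \mathcal{L}(\mathbb{R}^m; \mathbb{R}^n)$ is continuous in $(t,x,u)$ and continuously
differentiable with respect to $x, u$;

(A3) The first derivatives $\{f_x, \sigma_x, f_u, \sigma_u\}$ are bounded uniformly on $[0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)}$.

(A4) $\ell : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to (-\infty,\infty]$ is Borel measurable, continuously differentiable
with respect to $(x,u)$, $\varphi : [0,T] \times \mathbb{R}^n \to (-\infty,\infty]$ is continuously differentiable with
respect to $x$, $\ell(0,0,t)$ is bounded, and there exist $K_1, K_2 > 0$ such that

$$|\ell_x(t,x,u)|_{\mathbb{R}^n} + |\ell_u(t,x,u)|_{\mathbb{R}^d} \leq K_1\big(1 + |x|_{\mathbb{R}^n} + |u|_{\mathbb{R}^d}\big), \quad |\varphi_x(x)|_{\mathbb{R}^n} \leq K_2\big(1 + |x|_{\mathbb{R}^n}\big).$$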
The following lemma states existence of solutions and their continuous dependence on the
decision variables.

Lemma 1. Suppose Assumptions 1 hold. Then for any $\mathbb{F}_{0,0}$-measurable initial state $x_0$ having
finite second moment, and any $u \in \mathbb{U}^{(N)}_{reg}[0,T]$, the following hold.

(1) System (3) has a unique solution $x \in B^\infty_{\mathbb{F}_T}([0,T], L^2(\Omega,\mathbb{R}^n))$ having a continuous
modification, that is, $x \in C([0,T],\mathbb{R}^n)$, $\mathbb{P}$-a.s., $\forall i \in \mathbb{Z}_N$.

(2) The solution of system (3) is continuously dependent on the control, in the sense that,
as $u^{i,\alpha} \to u^{i,o}$ in $\mathbb{U}^i_{reg}[0,T]$, $\forall i \in \mathbb{Z}_N$, $x^\alpha \to x^o$ in $B^\infty_{\mathbb{F}_T}([0,T], L^2(\Omega,\mathbb{R}^n))$, $\forall i \in \mathbb{Z}_N$.

These statements also hold for feedback strategies $u \in \mathbb{U}^{(N),z^u}_{reg}[0,T]$.

Proof: Proof is identical to that of [4]. ■

Note that the differentiability of $f, \sigma, \ell$ with respect to $u$ can be removed without affecting
the results (by considering either needle variations when deriving the maximum principle or by
deriving the maximum principle for relaxed strategies and then specializing it to regular strategies
as in [1]).

Assumptions 1 are used to derive optimality conditions for stochastic control problems with
nonanticipative centralized strategies. However, for stochastic control problems with feedback 
centralized strategies additional assumptions are required to avoid certain technicalities associ- 
ated with the derivation of the maximum principle. In [1] we identified these assumptions for 
decentralized randomized feedback strategies; the main theorems are stated below. 

Assumptions 2. The following holds. 

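(E1) The diffusion coefficient $\sigma$ is restricted to the map $\sigma : [0,T] \times \mathbb{R}^n \to \mathcal{L}(\mathbb{R}^n, \mathbb{R}^n)$ (i.e.,
it is independent of $u$) and $\sigma(\cdot,\cdot)$ and $\sigma^{-1}(\cdot,\cdot)$ are bounded.

Define the $\sigma$-algebras

$$\mathcal{F}^{x(0),W}_{0,t} = \sigma\{x(0), W(s) : 0 \leq s \leq t\}, \quad \mathcal{F}^{x^u}_{0,t} = \sigma\{x(s) : 0 \leq s \leq t\}, \quad \forall t \in [0,T].$$

Under Assumptions 1, 2, if $u \in \mathbb{U}^{(N),z^u}_{reg}[0,T]$ then $\mathcal{F}^{x(0),W}_{0,t} = \mathcal{F}^{x^u}_{0,t}$, $\forall t \in [0,T]$. Thus, for any $u^i \in
\mathbb{U}^{z^i,u}_{reg}[0,T]$ which is $\mathcal{G}^{z^i,u}_T$-adapted there exists a function $\phi^i(\cdot)$ measurable to a sub-$\sigma$-algebra
$\mathcal{G}^{I^i}_{0,t} \subset \mathcal{F}^{x(0),W}_{0,t}$ such that $u^i_t(\omega) = \phi^i(t, x(0), W(\cdot \wedge t, \omega))$, $\mathbb{P}$-a.s. $\omega \in \Omega$, $\forall t \in [0,T]$, $i = 1,\ldots,N$.
Define all such adapted nonanticipative functions by

$$\overline{\mathbb{U}}^i_{reg}[0,T] = \Big\{ u^i \in L^2_{\mathcal{G}^{I^i}_T}([0,T],\mathbb{R}^{d_i}) : u^i \in \mathbb{U}^{z^i,u}_{reg}[0,T] \Big\}, \quad \forall i \in \mathbb{Z}_N. \tag{9}$$

Next, we introduce the following additional assumptions.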

Assumptions 3. The following hold. 

(E2) $\overline{\mathbb{U}}^i_{reg}[0,T]$ is dense in $\mathbb{U}^i_{reg}[0,T]$, $\forall i \in \mathbb{Z}_N$.

Under Assumptions 1 it can be shown that $J(\cdot)$ is continuous in the sense of $\mathbb{U}^{(N)}_{reg}[0,T]$ and by
Assumptions 3 we have $\inf_{u \in \times_{i=1}^N \overline{\mathbb{U}}^i_{reg}[0,T]} J(u) = \inf_{u \in \times_{i=1}^N \mathbb{U}^i_{reg}[0,T]} J(u)$. Hence, the necessary
conditions for feedback information structures $u \in \mathbb{U}^{(N),z^u}_{reg}[0,T]$ to be optimal are those for
which nonanticipative information structures $u \in \mathbb{U}^{(N)}_{reg}[0,T]$ are optimal.

We now show that under Assumptions 1, 2, Assumptions 3 hold.

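Theorem 1. Consider Problem 1 under Assumptions 1, 2. Then

$$\inf_{u \in \times_{i=1}^N \overline{\mathbb{U}}^i_{reg}[0,T]} J(u) = \inf_{u \in \times_{i=1}^N \mathbb{U}^i_{reg}[0,T]} J(u).$$

Proof: We follow the procedure in [28]. For any $u^i \in \mathbb{U}^{z^i,u}_{reg}[0,T]$ which is $\mathcal{G}^{z^i,u}_T$-adapted
we can define the set $\overline{\mathbb{U}}^i_{reg}[0,T]$, $i = 1,\ldots,N$ via (9). Let $u \in \mathbb{U}^{(N)}_{reg}[0,T] = \times_{i=1}^N \mathbb{U}^i_{reg}[0,T]$ and
for $k = T/M$, define

$$u^i_{k,t} = \begin{cases} u^i_0 & \text{for } 0 \leq t < k, \quad u^i_0 \in \mathbb{A}^i, \\ \frac{1}{k}\int_{(n-1)k}^{nk} u^i_s\, ds & \text{for } nk \leq t < (n+1)k, \quad n = 1,\ldots,M-1, \end{cases} \tag{10}$$

for $i = 1,\ldots,N$. Then $u_k = (u^1_k, \ldots, u^N_k) \in \mathbb{U}^{(N)}_{reg}[0,T]$, and $u_k \to u$ in $L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^d)$. We
need to show that $u_k \in \mathbb{U}^{(N),z^u}_{reg}[0,T]$. Let $x_k(\cdot)$ denote the trajectory corresponding to $u_k$, and
$\mathcal{F}^{x_k}_{0,t}$ the $\sigma$-algebra generated by $\{x_k(s) : 0 \leq s \leq t\}$. Define

$$I_k(t) = \int_0^t \sigma(s,x_k(s))\,dW(s) = x_k(t) - x(0) - \int_0^t f(s,x_k(s),u_{k,s})\,ds, \tag{11}$$

and

$$W(t) = \int_0^t \sigma(s,x_k(s))^{-1}\, dI_k(s). \tag{12}$$

Since $u_k \in \mathbb{U}^{(N)}_{reg}[0,T]$, then $I_k(t)$ is $\mathcal{F}^{x_k}_{0,t}$-measurable, for $0 \leq t \leq k$. Hence,

$$\mathcal{F}^{x(0),W}_{0,t} = \mathcal{F}^{x_k}_{0,t}, \quad 0 \leq t \leq k. \tag{13}$$

Therefore, $u_{k,t}$ is $\mathcal{F}^{x_k}_{0,k}$-measurable for $k \leq t \leq 2k$. From the above equations it follows that
(13) also holds for $k \leq t \leq 2k$, and by induction that $\mathcal{F}^{x(0),W}_{0,t} = \mathcal{F}^{x_k}_{0,t}$, $\forall t \in [0,T]$. Therefore,
$u^i_{k,t}$ is also measurable with respect to $\mathcal{F}^{x_k}_{0,t}$. Hence, for any $u^i$ which is measurable with respect
to a nonanticipative functional $z^i = h^i(t,x)$ there exists a nonanticipative functional of $x(0), W$
which realizes it.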

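Now, it is sufficient to show that as $u^{i,\alpha} \to u^i$ in $\mathbb{U}^i_{reg}[0,T]$, $\forall i \in \mathbb{Z}_N$, then $J(u^\alpha) \to J(u)$.
Utilizing Assumptions 1 we can show that $\mathbb{E}\sup_{s \in [0,t]} |x^\alpha(s) - x(s)|_{\mathbb{R}^n}$ converges to zero as
$\alpha \to 0$, hence it is sufficient to show that $|J(u^\alpha) - J(u)|$ also converges to zero, as $\alpha \to 0$.
By the mean value theorem we have the following inequality.

$$|J(u^\alpha) - J(u)| \leq K_1 \mathbb{E}\Big\{ \int_{[0,T]} \big(|x^\alpha(t)|_{\mathbb{R}^n} + |u^\alpha_t|_{\mathbb{R}^d} + |x(t)|_{\mathbb{R}^n} + |u_t|_{\mathbb{R}^d} + 1\big)\big(|x^\alpha(t) - x(t)|_{\mathbb{R}^n} + |u^\alpha_t - u_t|_{\mathbb{R}^d}\big)\,dt \Big\} + K_2 \mathbb{E}\Big\{ \big(|x^\alpha(T)|_{\mathbb{R}^n} + |x(T)|_{\mathbb{R}^n} + 1\big)|x^\alpha(T) - x(T)|_{\mathbb{R}^n} \Big\}. \tag{14}$$

Since $\mathbb{E}\sup_{s \in [0,t]} |x^\alpha(s) - x(s)|_{\mathbb{R}^n} \to 0$ as $\alpha \to 0$, then $|J(u^\alpha) - J(u)|$ also converges to zero,
as $\alpha \to 0$.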

■ 
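
The delayed-averaging approximation of a strategy used in the proof can be sketched numerically. The Python fragment below is a minimal illustration for a scalar deterministic function of time, assuming a midpoint quadrature rule and hypothetical names (`delayed_average`, `l2_error`). On each interval of length k = T/M the approximation uses only values of the original function from the preceding interval, which is what makes it realizable from past information.

```python
def delayed_average(u, T, M, u0=0.0, quad=50):
    """Piecewise-constant approximation u_k of u with step k = T / M.

    u_k equals u0 on [0, k); on [n*k, (n+1)*k) it equals the mean of u
    over the preceding interval [(n-1)*k, n*k], for n = 1, ..., M-1.
    """
    k = T / M
    means = []
    for n in range(1, M):
        a = (n - 1) * k
        # Midpoint rule for (1/k) * integral of u over [a, a + k].
        means.append(sum(u(a + (j + 0.5) * k / quad) for j in range(quad)) / quad)

    def u_k(t):
        n = min(int(t / k), M - 1)
        return u0 if n == 0 else means[n - 1]

    return u_k


def l2_error(u, v, T, grid=4000):
    """Midpoint-rule approximation of the L2([0, T]) distance between u and v."""
    h = T / grid
    return (sum((u((j + 0.5) * h) - v((j + 0.5) * h)) ** 2 for j in range(grid)) * h) ** 0.5


if __name__ == "__main__":
    ramp = lambda s: s
    for M in (4, 16, 64):
        print(M, l2_error(ramp, delayed_average(ramp, 1.0, M), 1.0))
```

For the ramp used here the printed error shrinks as M grows, in line with the convergence claimed in the proof; the sketch does not address measurability, which is the substance of the argument.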

Thus, under the assumptions of Theorem 1, if $u \in \mathbb{U}^{(N),z^u}_{reg}[0,T]$ achieves the infimum of $J(u)$
then it is also optimal with respect to $\overline{\mathbb{U}}^{(N)}_{reg}[0,T] = \times_{i=1}^N \overline{\mathbb{U}}^i_{reg}[0,T]$. Consequently, the necessary
conditions for feedback information structures $u \in \mathbb{U}^{(N),z^u}_{reg}[0,T]$ to be optimal are those for
which nonanticipative information structures $u \in \mathbb{U}^{(N)}_{reg}[0,T]$ are optimal.

In the next remark we give an example for which Assumptions 2 hold, and hence Theorem 1 is
valid.

Remark 1. Suppose $x^1$ and $x^2$ are governed by the following stochastic differential equations

$$dx^1(t) = f^1(t,x^1(t),u^1(t))dt + \sigma^1(t,x^1(t))dW^1(t), \quad x^1(0) = x^1_0, \tag{15}$$
$$dx^2(t) = f^2(t,x^1(t),x^2(t),u^1(t),u^2(t))dt + \sigma^2(t,x^1(t),x^2(t))dW^2(t), \quad x^2(0) = x^2_0, \tag{16}$$
$$z^1(t) = h^1(t,x^1(t)), \quad z^2(t) = h^2(t,x^1(t),x^2(t)), \quad t \in [0,T], \tag{17}$$

where $h^1, h^2$ are measurable, $W^1(\cdot), W^2(\cdot)$ are independent, and $u^1 \in \mathbb{U}^{z^1,u}_{reg}[0,T]$, $u^2 \in \mathbb{U}^{z^2,u}_{reg}[0,T]$.
If we further assume that $\sigma^i(\cdot,\cdot)$, $\sigma^{i,-1}(\cdot,\cdot)$ are bounded, and Assumptions 1 hold, then

$$\mathcal{F}^{x^1}_{0,t} = \sigma\{x^1(s) : 0 \leq s \leq t\} = \mathcal{F}^{x^1(0),W^1}_{0,t} = \sigma\{x^1(0), W^1(s) : 0 \leq s \leq t\}.$$

Moreover, it can be shown that a corresponding identity holds for the $\sigma$-algebra generated by $(x^1, x^2)$. Then we can find $\overline{\mathbb{U}}^i_{reg}[0,T]$, $i = 1,2$ for which (E2) holds,
and thus Theorem 1 is valid.

Next, we state the main theorem which gives necessary and sufficient optimality conditions for 
nonanticipative and feedback decisions. 
Define the Hamiltonian 

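$$\mathcal{H} : [0,T] \times \mathbb{R}^n \times \mathbb{R}^n \times \mathcal{L}(\mathbb{R}^m,\mathbb{R}^n) \times \mathbb{A}^{(N)} \to \mathbb{R},$$

by

$$\mathcal{H}(t,x,\psi,Q,u) = \langle f(t,x,u), \psi \rangle + \mathrm{tr}\big(Q^*\sigma(t,x,u)\big) + \ell(t,x,u), \quad t \in [0,T]. \tag{18}$$

For any $u \in \mathbb{U}^{(N)}_{reg}[0,T]$, the adjoint process $(\psi, Q) \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n) \times L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n))$
satisfies the following backward stochastic differential equation

$$d\psi(t) = -f_x^*(t,x(t),u_t)\psi(t)dt - V_Q(t)dt - \ell_x(t,x(t),u_t)dt + Q(t)dW(t), \quad t \in [0,T),$$
$$\phantom{d\psi(t)} = -\mathcal{H}_x(t,x(t),\psi(t),Q(t),u_t)dt + Q(t)dW(t), \tag{19}$$
$$\psi(T) = \varphi_x(x(T)), \tag{20}$$

where $V_Q \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n)$ is given by $\langle V_Q(t), \zeta \rangle = \mathrm{tr}\big(Q^*(t)\sigma_x(t,x(t),u_t;\zeta)\big)$, $t \in [0,T]$ (e.g.,
$V_Q(t) = \sum_{k=1}^m \big(\sigma^{(k)}_x(t,x(t),u_t)\big)^* Q^{(k)}(t)$, $t \in [0,T]$, where $\sigma^{(k)}$ is the $k$th column of $\sigma$, $\sigma^{(k)}_x$ is the
derivative of $\sigma^{(k)}$ with respect to the state, for $k = 1,2,\ldots,m$, and $Q^{(k)}$ is the $k$th column of $Q$).
The state process satisfies the stochastic differential equation

$$dx(t) = f(t,x(t),u_t)dt + \sigma(t,x(t),u_t)dW(t), \quad t \in (0,T],$$
$$\phantom{dx(t)} = \mathcal{H}_\psi(t,x(t),\psi(t),Q(t),u_t)dt + \sigma(t,x(t),u_t)dW(t), \tag{21}$$
$$x(0) = x_0. \tag{22}$$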

The main theorem is stated below. 

Theorem 2. (Team optimality) Consider Problem 1 under Assumptions 1 and assume existence
of an optimal team strategy.

(I) Suppose $\mathbb{F}_T$ is the filtration generated by $x(0)$ and the Brownian motion $\{W(t) : t \in [0,T]\}$.

Necessary Conditions. For an element $u^o \in \mathbb{U}^{(N)}_{reg}[0,T]$ with the corresponding solution
$x^o \in B^\infty_{\mathbb{F}_T}([0,T], L^2(\Omega,\mathbb{R}^n))$ to be team optimal, it is necessary that the following hold.

(1) There exists a semi martingale with the intensity process $(\psi^o, Q^o) \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n) \times
L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n))$.

(2) The variational inequality is satisfied:

$$\sum_{i=1}^N \mathbb{E}\Big\{ \int_0^T \big\langle \mathcal{H}_{u^i}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t),\, u^i_t - u^{i,o}_t \big\rangle\, dt \Big\} \geq 0, \quad \forall u \in \mathbb{U}^{(N)}_{reg}[0,T]. \tag{23}$$

(3) The process $(\psi^o, Q^o) \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n) \times L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n))$ is a unique
solution of the backward stochastic differential equation (19), (20) such that
$u^o \in \mathbb{U}^{(N)}_{reg}[0,T]$ satisfies the point wise almost sure inequalities with respect
to the $\sigma$-algebras $\mathcal{G}^i_{0,t} \subset \mathbb{F}_{0,t}$, $t \in [0,T]$, $i = 1,2,\ldots,N$:

$$\big\langle \mathbb{E}\big\{ \mathcal{H}_{u^i}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t)\, \big|\, \mathcal{G}^i_{0,t} \big\},\, u^i - u^{i,o}_t \big\rangle \geq 0, \quad \forall u^i \in \mathbb{A}^i,\ \text{a.e. } t \in [0,T],\ \mathbb{P}|_{\mathcal{G}^i_{0,t}}\text{-a.s.},\ i = 1,2,\ldots,N. \tag{24}$$

Sufficient Conditions. Let $(x^o(\cdot), u^o(\cdot))$ denote an admissible state and decision
pair and let $\psi^o(\cdot)$ be the corresponding adjoint process.
Suppose the following conditions hold.

(B1) $\mathcal{H}(t,\cdot,\psi,Q,\cdot)$, $t \in [0,T]$, is convex in $(x,u) \in \mathbb{R}^n \times \mathbb{A}^{(N)}$;
(B2) $\varphi(\cdot)$ is convex in $x \in \mathbb{R}^n$.

Then $(x^o(\cdot), u^o(\cdot))$ is optimal if it satisfies (24).

(II) Suppose $\mathbb{F}_T$ is the filtration generated by $x(0)$ and the Brownian motion $\{W(t) : t \in
[0,T]\}$, and Assumptions 3 hold. The necessary and sufficient conditions for a feedback
element $u^o \in \mathbb{U}^{(N),z^u}_{reg}[0,T]$ to be optimal are given by the statements under Part (I)
with $\mathcal{G}^i_{0,t}$ replaced by $\mathcal{G}^{z^i,u}_{0,t}$, $\forall t \in [0,T]$.

Proof: See [29].

■

Next, we have the following corollary regarding person-by-person optimality. 

Corollary 1. (Person-by-person optimality) Consider Problem 2 under the conditions of Theorem 2.
Then the necessary and sufficient conditions of Theorem 2 hold with variational inequality
(23) replaced by

$$\mathbb{E}\Big\{ \int_0^T \big\langle \mathcal{H}_{u^i}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t),\, u^i_t - u^{i,o}_t \big\rangle\, dt \Big\} \geq 0, \quad \forall u^i \in \mathbb{U}^i_{reg}[0,T], \quad \forall i \in \mathbb{Z}_N. \tag{25}$$

Proof: See [1]. ■

It can be shown by contradiction that the team and person-by-person optimality conditions
presented above are equivalent (see [1]).

Often in the application of the minimum principle we need to identify the martingale term in
the adjoint process equation. One approach to determining $Q$ is discussed in the next remark.

Remark 2. Utilizing the Riesz representation theorem for Hilbert space martingales, in [1] the
adjoint process $Q(\cdot)$ in the adjoint equation (19) is identified as $Q(t) = \psi_x(t)\sigma(t,x(t),u_t)$,
provided $\psi_x$ exists (i.e., $f, \sigma, \ell, \varphi$ are twice continuously differentiable and $f_{xx}, \sigma_{xx}, \ell_{xx}, \varphi_{xx}$ are
uniformly bounded).

Note that from the team optimality conditions presented above we also deduce the optimality 
conditions for centralized full and partial information strategies. This observation is stated in the 
next remark (for partial information strategies) 

Remark 3. Consider Problem 1 under the conditions of Theorem 2, Part (I) and Part (II), and
assume $u^i$ are adapted to the centralized partial information $\mathcal{G}_T \subset \mathbb{F}_T$, and centralized partial
information $\mathcal{G}^{z^u}_T \subset \mathcal{F}^{x^u}_T$, respectively. Then the necessary conditions for $\mathcal{G}_T$-adapted $u^i$'s are
given by the following point wise almost sure inequalities

$$\mathbb{E}\big\{ \mathcal{H}(t,x^o(t),\psi^o(t),Q^o(t),u)\, \big|\, \mathcal{G}_{0,t} \big\} \geq \mathbb{E}\big\{ \mathcal{H}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t)\, \big|\, \mathcal{G}_{0,t} \big\}, \quad \forall u \in \mathbb{A}^{(N)},\ \text{a.e. } t \in [0,T],\ \mathbb{P}|_{\mathcal{G}_{0,t}}\text{-a.s.}, \tag{26}$$

where $\{x^o(t), \psi^o(t), Q^o(t) : t \in [0,T]\}$ are the solutions of the Hamiltonian system (21), (22),
(19), (20), while for $\mathcal{G}^{z^u}_T$-adapted $u^i$'s the necessary condition is (26) with the conditioning
done with respect to $\mathcal{G}^{z^u}_{0,t}$. This corresponds to the partial information investigated in [?].

III. Optimal Team Strategies for Classes of Games 

We are now ready to derive explicit optimal team strategies for general classes of team games, 
when the dynamics and the reward have certain structures. These include nonlinear as well as 
linear distributed systems. Our focus is on optimal decentralized strategies which are given in
a) closed form involving conditional expectations based on the information structures available 
to the DM's and b) closed expressions similar to the classical Linear-Quadratic Theory. 
First, we define the main classes of team games we shall investigate. 

Definition 1. (Team games with special structures) We define the following forms of team games. 
(GNF): Generalized Normal Form. The team game is said to have "generalized normal form"
if

$$f(t,x,u) = b(t,x) + g(t,x)u, \quad g(t,x)u = \sum_{j=1}^N g^{(j)}(t,x)u^j, \tag{27}$$

$$\sigma(t,x,u) = \big[\, \kappa^{(1)}(t,x) \;\; \kappa^{(2)}(t,x) \;\; \ldots \;\; \kappa^{(m)}(t,x) \,\big] + \big[\, s_1(t,x)u \;\; s_2(t,x)u \;\; \ldots \;\; s_m(t,x)u \,\big], \tag{28}$$

$$\ell(t,x,u) = \frac{1}{2}\langle u, R(t,x)u \rangle + \frac{1}{2}\lambda(t,x) + \langle u, \eta(t,x) \rangle, \tag{29}$$

where

$$\langle u, R(t,x)u \rangle = \sum_{i=1}^N \sum_{j=1}^N u^{i,*} R_{ij}(t,x) u^j, \quad \langle u, \eta(t,x) \rangle = \sum_{i=1}^N u^{i,*}\eta^i(t,x),$$

and $\kappa^{(i)}(\cdot,\cdot)$ is the $i$th column of an $n \times m$ matrix $\kappa(\cdot,\cdot)$, for $i = 1,2,\ldots,m$, $s_i(\cdot,\cdot)$ is an
$n \times d$ matrix, for $i = 1,2,\ldots,m$, $R(\cdot,\cdot)$ is symmetric uniformly positive definite, and $\lambda(\cdot,\cdot)$ is
uniformly positive semidefinite.

GNF refers to the case when the drift and diffusion coefficients $f, \sigma$ are linear in the decision
variable $u$, and the pay-off function $\ell$ is quadratic in $u$, while $f, \sigma, \ell, \varphi$ are nonlinear in $x$.

(SGNF): Simplified Generalized Normal Form. A team game is said to have "simplified
generalized normal form" if it is of generalized normal form and $\sigma(t,x,u)$ is independent of $u$,
that is, $s_j = 0$, $1 \leq j \leq m$.

SGNF refers to the case when $f$ is linear in $u$, $\sigma$ is independent of $u$, $\ell$ is quadratic in $u$, and
$f, \sigma, \ell$ are nonlinear in $x$.
(NF): Normal Form. A team game is said to have "normal form" if

$$f(t,x,u) = A(t)x + b(t) + B(t)u, \tag{30}$$
$$\sigma(t,x,u) = \big[\, \kappa_1(t)x \;\; \kappa_2(t)x \;\; \ldots \,\big] + \ldots$$

and $\kappa_i(\cdot) \in \mathcal{L}(\mathbb{R}^n; \mathbb{R}^n)$ for $i = 1,\ldots,m$, $s_i(\cdot) \in \mathcal{L}(\mathbb{R}^d; \mathbb{R}^n)$ for $i = 1,\ldots,m$, and $R(\cdot)$ is
symmetric uniformly positive definite, $H(\cdot)$ is symmetric uniformly positive semidefinite, and
$M(T)$ is symmetric positive semidefinite.

NF refers to the case when $f, \sigma$ are linear in $x, u$, and $\ell, \varphi$ are quadratic in $x, u$. Therefore, the
dynamics also include stochastic integral terms which are linear in $x, u$.

(LQF): Linear-Quadratic Form. A team game is said to have "linear-quadratic form" if



and $R(\cdot)$ is symmetric uniformly positive definite, $H(\cdot)$ is symmetric uniformly positive semidefinite,
and $M(T)$ is symmetric positive semidefinite.

LQF refers to the case when $f$ is linear in $x, u$, $\sigma$ is independent of $x, u$, and $\ell, \varphi$ are quadratic in
$x, u$; this is the classical linear-quadratic (dynamics, pay-off) model often utilized in centralized
decision making.

Below we compute the optimal strategies for the different cases of Definition 1. Although
we utilized nonanticipative strategies $\mathbb{U}^{(N)}_{reg}[0,T]$, these computations can be done for feedback
strategies $\mathbb{U}^{(N),z^u}_{reg}[0,T]$.

Case GNF. 

Utilizing the definition of Hamiltonian (18), its derivative is given by

$$\mathcal{H}_u(t,x,\psi,Q,u) = g^*(t,x)\psi + \sum_{i=1}^m s_i^*(t,x)Q^{(i)}(t) + R(t,x)u + \eta(t,x), \quad (t,x) \in [0,T] \times \mathbb{R}^n. \tag{36}$$

Since the diffusion coefficient $\sigma$ depends on $u$, then $Q^o(\cdot)$ also depends on the control and by
Remark 2, $Q^o$ is given by

$$Q^o(t) = \psi^o_x(t)\sigma(t,x^o(t),u^o_t), \tag{37}$$
$$Q^{(i),o}(t) = \psi^o_x(t)\big\{ \kappa^{(i)}(t,x^o(t)) + s_i(t,x^o(t))u^o_t \big\}, \quad t \in [0,T], \quad i = 1,2,\ldots,m. \tag{38}$$

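Define the quantities

$$\Lambda(t,x,\psi_x) = \sum_{i=1}^m s_i^*(t,x)\psi_x(t)\kappa^{(i)}(t,x) = \begin{bmatrix} \Lambda^1 \\ \Lambda^2 \\ \vdots \\ \Lambda^N \end{bmatrix}, \quad M(t,x,\psi_x) = \sum_{i=1}^m s_i^*(t,x)\psi_x(t)s_i(t,x) = \begin{bmatrix} M_{11} & M_{12} & \ldots & M_{1N} \\ M_{21} & M_{22} & \ldots & M_{2N} \\ \vdots & \vdots & & \vdots \\ M_{N1} & M_{N2} & \ldots & M_{NN} \end{bmatrix},$$

where $\Lambda^i \in \mathcal{L}(\mathbb{R}^m, \mathbb{R}^{d_i})$, $M_{ij} \in \mathcal{L}(\mathbb{R}^{d_j}, \mathbb{R}^{d_i})$, $i,j = 1,2,\ldots,N$.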

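By Theorem 2, substituting $Q^o(\cdot)$ given by (38) into (36), and utilizing the fact that $u^{i,o}_t$ is
$\mathcal{G}^i_{0,t}$-adapted for each $i \in \mathbb{Z}_N$, the explicit expression for $u^{i,o}_t$ is obtained from (24), and it is
given by

$$u^{i,o}_t = -\Big\{ \mathbb{E}\big( (R_{ii} + M_{ii})(t,x^o(t),\psi^o_x(t))\, \big|\, \mathcal{G}^i_{0,t} \big) \Big\}^{-1} \Big\{ \mathbb{E}\big( \eta^i(t,x^o(t)) + \Lambda^i(t,x^o(t),\psi^o_x(t))\, \big|\, \mathcal{G}^i_{0,t} \big) + \sum_{j=1, j \neq i}^N \mathbb{E}\big( (R_{ij} + M_{ij})(t,x^o(t),\psi^o_x(t))\, u^{j,o}_t\, \big|\, \mathcal{G}^i_{0,t} \big) + \mathbb{E}\big( g^{(i),*}(t,x^o(t))\psi^o(t)\, \big|\, \mathcal{G}^i_{0,t} \big) \Big\}, \quad \mathbb{P}|_{\mathcal{G}^i_{0,t}}\text{-a.s.}, \quad i = 1,2,\ldots,N. \tag{39}$$

Next, we make some observations.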

(O1): At any $t \in [0,T]$, $u^{i,o}_t$ is a functional of estimates of all other optimal decisions
$u^{j,o}_t$, $j \neq i$, given its own information. Such strategies impose a heavy computational
burden on any decentralized decision maker. Therefore, a question which might be of
interest to address is "what information needs to be signaled among the DMs to reduce
computations?" The answer to this question will become apparent when we proceed to
compute the explicit expressions of the optimal strategies.

(O2): In the simplified case of diagonal $M, R$, the right side of (39) does not depend directly
on estimates of the other DMs, but this dependence is hidden in the adjoint process
$\psi^o(\cdot)$. In fact, since no communication exchange is allowed between the DMs,
any communication between them is made via the interaction of the DMs with the
state and adjoint processes $x^o(\cdot), \psi^o(\cdot)$. One may view the stochastic differential system
together with the adjoint backward stochastic differential equation as playing the role
of a channel that makes communication between the DMs possible. Therefore, an
interesting question is "can we quantify the amount of information communicated
among the DMs via the Hamiltonian system of equations and if so, can we utilize
this insight to reduce the computational burden, by allowing limited signaling between
the DMs?" We shall return to this question and identify the variables which are involved
in such communication between the DMs.

Finally, note that the optimal strategies can be further simplified by assuming $g(t,x)$ is linear
in $x$ and $\sigma(t,x,u)$ is linear in $x$, $R(\cdot,\cdot)$ is independent of $x$, $\lambda(\cdot,\cdot)$ is quadratic in $x$, and $\eta(\cdot,\cdot)$
is linear in $x$.



Case SGNF.

For a team game of simplified generalized normal form, the diffusion coefficient $\sigma$ is independent of
$u$, therefore the second right hand side term of (36) is zero (since $s_i = 0$, $i = 1,2,\ldots,m$), and
the derivative of the Hamiltonian is linear in $u$. Therefore, the explicit expressions for $u^{i,o}$ are
obtained from (39) by setting $M_{ij} = 0$, $\Lambda^i = 0$, $i,j = 1,\ldots,N$, hence

$$u^{i,o}_t = -\Big\{ \mathbb{E}\big( R_{ii}(t,x^o(t))\, \big|\, \mathcal{G}^i_{0,t} \big) \Big\}^{-1} \Big\{ \mathbb{E}\big( \eta^i(t,x^o(t))\, \big|\, \mathcal{G}^i_{0,t} \big) + \sum_{j=1, j \neq i}^N \mathbb{E}\big( R_{ij}(t,x^o(t))\, u^{j,o}_t\, \big|\, \mathcal{G}^i_{0,t} \big) + \mathbb{E}\big( g^{(i),*}(t,x^o(t))\psi^o(t)\, \big|\, \mathcal{G}^i_{0,t} \big) \Big\}, \quad \mathbb{P}|_{\mathcal{G}^i_{0,t}}\text{-a.s.}, \quad i = 1,2,\ldots,N. \tag{40}$$

By comparing the optimal strategies for GNF given by (39) and (40) we have the following
observation.

(O3): When $\sigma$ is independent of $u$, then the adjoint process $Q^o(\cdot)$ is independent of $u^o$, and
therefore the optimal strategies do not involve derivatives of the adjoint process $\psi_x(\cdot)$
as in (39).

(O4): The team game formulation also includes as a special case distributed estimation, as
follows. Suppose each component of the vector $x = (x^1, \ldots, x^N)$ denotes the channel
output at different distributed receivers carrying an information message (RV) $\theta =
(\theta^1, \ldots, \theta^N)$, $\theta^i : \Omega \to \mathbb{R}^{n_i}$, $i = 1,\ldots,N$, and each channel is subject to feedback and
interference from the other channels. Then each channel output can be described by

$$dx^i(t) = b^i(t,x^i(t),\theta^i)dt + \kappa^i(t,x^i(t))dW^i + \sum_{j=1, j \neq i}^N b^{ij}(t,x^j(t),\theta^j)dt + \sum_{j=1, j \neq i}^N \kappa^{ij}(t,x^j(t))dW^j, \quad x^i(0) = x^i_0, \quad i = 1,\ldots,N. \tag{41}$$

Thus, $\{x^i(t) : 0 \leq t \leq T\}$ describes the channel output of the $i$th receiver which is
subject to feedback and interference from the other channels, $\theta^i$ is the message to be
estimated at the $i$th receiver, and $u^i_t(\{x^i(s) : 0 \leq s \leq t\})$ is its team optimal estimator
at time $t \in [0,T]$, based on having access to $x^i$. Then the optimal distributed team
estimators are obtained from (40), and they are given by the following equation.

$$u^{i,o}_t = -\Big\{ \mathbb{E}\big( R_{ii}(t,x(t))\, \big|\, \mathcal{G}^{x^i}_{0,t} \big) \Big\}^{-1} \Big\{ \mathbb{E}\big( \eta^i(t,x(t))\, \big|\, \mathcal{G}^{x^i}_{0,t} \big) + \sum_{j=1, j \neq i}^N \mathbb{E}\big( R_{ij}(t,x(t))\, u^{j,o}_t\, \big|\, \mathcal{G}^{x^i}_{0,t} \big) \Big\}, \quad \mathbb{P}|_{\mathcal{G}^{x^i}_{0,t}}\text{-a.s.}, \quad i = 1,2,\ldots,N. \tag{42}$$

One may consider several other scenarios of distributed estimation by considering a
specific pay-off function $\ell(t,x,u)$ which represents estimation error.

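Case NF

For a team game of normal form define the quantities

\[
\Lambda(t,x,\psi_x) = \sum_{i=1}^{m} s_i^*(t)\, \psi_x(t) \big( \kappa_i(t) x + G^{(i)}(t) \big) \equiv
\begin{bmatrix} \Lambda^1(t,x,\psi_x) \\ \Lambda^2(t,x,\psi_x) \\ \vdots \\ \Lambda^N(t,x,\psi_x) \end{bmatrix},
\qquad
M(t,\psi_x) = \sum_{i=1}^{m} s_i^*(t)\, \psi_x(t)\, s_i(t) \equiv
\begin{bmatrix} M_{11} & M_{12} & \cdots & M_{1N} \\ M_{21} & M_{22} & \cdots & M_{2N} \\ \vdots & \vdots & & \vdots \\ M_{N1} & M_{N2} & \cdots & M_{NN} \end{bmatrix},
\]

where $\Lambda^i \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^{d_i})$, $i = 1, \ldots, N$, and $M_{ij} \in \mathcal{L}(\mathbb{R}^{d_j}, \mathbb{R}^{d_i})$, $i,j = 1, 2, \ldots, N$. Then from the optimal strategies under GNF one
obtains

\[
u^{i,o}_t = -\Big\{ R_{ii}(t) + \mathbb{E}\big( M_{ii}(t,\psi^o_x(t)) \,\big|\, \mathcal{G}^{i}_{0,t} \big) \Big\}^{-1}
\Big\{ m^i(t) + \mathbb{E}\Big( \sum_{j=1}^{N} E_{ij}(t) x^{j,o}(t) + \Lambda^i(t,x^o(t),\psi^o_x(t)) \,\Big|\, \mathcal{G}^{i}_{0,t} \Big)
+ \sum_{j=1, j \neq i}^{N} \mathbb{E}\Big( \big( R_{ij}(t) + M_{ij}(t,\psi^o_x(t)) \big) u^{j,o}_t \,\Big|\, \mathcal{G}^{i}_{0,t} \Big)
+ B^{(i),*}(t)\, \mathbb{E}\big( \psi^o(t) \,\big|\, \mathcal{G}^{i}_{0,t} \big) \Big\},
\quad \mathbb{P}|_{\mathcal{G}^{i}_{0,t}}\text{-a.s.}, \quad i = 1, 2, \ldots, N. \tag{43}
\]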

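Another important observation is the following.

(O5): The expressions of the optimal team strategies can be written in a fixed point form.
This is described next for the case LQF.

Case LQF with $E \neq 0$, $m \neq 0$.

For a team game of linear-quadratic form (with $E$, $m$ non-zero), from the previous optimal
strategies one obtains

\[
u^{i,o}_t = -R_{ii}^{-1}(t) \Big\{ m^i(t) + \sum_{j=1}^{N} E_{ij}(t)\, \mathbb{E}\big( x^{j,o}(t) \,\big|\, \mathcal{G}^{i}_{0,t} \big)
+ \sum_{j=1, j \neq i}^{N} R_{ij}(t)\, \mathbb{E}\big( u^{j,o}_t \,\big|\, \mathcal{G}^{i}_{0,t} \big)
+ B^{(i),*}(t)\, \mathbb{E}\big( \psi^o(t) \,\big|\, \mathcal{G}^{i}_{0,t} \big) \Big\},
\quad \mathbb{P}|_{\mathcal{G}^{i}_{0,t}}\text{-a.s.}, \quad i = 1, 2, \ldots, N. \tag{44}
\]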

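Note that (44) can be put in the form of a fixed point matrix equation with random coefficients
as follows. For $i, j = 1, \ldots, N$, let $\hat{u}^{j,i,o}(t)$ and $\hat{x}^{j,i,o}(t)$ denote the conditional expectations of $u^{j,o}_t$ and $x^{j,o}(t)$ with respect to $\mathcal{G}^{i}_{0,t}$, and define

\[
\hat{u}^{i,o}(t) = \mathrm{Vector}\{ \hat{u}^{1,i,o}(t), \ldots, \hat{u}^{N,i,o}(t) \}, \quad
\hat{x}^{i,o}(t) = \mathrm{Vector}\{ \hat{x}^{1,i,o}(t), \ldots, \hat{x}^{N,i,o}(t) \}, \quad i = 1, \ldots, N,
\]
\[
\hat{u}^{o}(t) = \mathrm{Vector}\{ \hat{u}^{1,o}(t), \ldots, \hat{u}^{N,o}(t) \}, \quad
\hat{x}^{o}(t) = \mathrm{Vector}\{ \hat{x}^{1,o}(t), \ldots, \hat{x}^{N,o}(t) \}, \tag{45}
\]
\[
\hat{\psi}^{o}(t) = \mathrm{Vector}\big\{ \mathbb{E}\big( \psi^o(t) \,\big|\, \mathcal{G}^{1}_{0,t} \big), \ldots, \mathbb{E}\big( \psi^o(t) \,\big|\, \mathcal{G}^{N}_{0,t} \big) \big\}, \quad
R^{[i]}(t) = \big[ R_{i1}(t), \ldots, R_{iN}(t) \big], \quad
E^{[i]}(t) = \big[ E_{i1}(t), \ldots, E_{iN}(t) \big], \quad i = 1, \ldots, N. \tag{46}
\]

Taking expectation of both sides of (44) with respect to $\mathcal{G}^{i}_{0,t}$, (44) is written as a
linear equation with random coefficients as follows.

\[
\mathrm{diag}\{ R^{[1]}(t), \ldots, R^{[N]}(t) \}\, \hat{u}^{o}(t) + \mathrm{diag}\{ E^{[1]}(t), \ldots, E^{[N]}(t) \}\, \hat{x}^{o}(t)
+ \mathrm{diag}\{ B^{(1),*}(t), \ldots, B^{(N),*}(t) \}\, \hat{\psi}^{o}(t) + m(t) = 0. \tag{47}
\]

Clearly, (47) can be solved via fixed point methods, provided we determine the estimates
$\hat{x}^{o}(t)$, $\hat{\psi}^{o}(t)$. In the next subsection we determine the estimate $\hat{x}^{o}(t)$, and also show that $\hat{\psi}^{o}(t)$
can be expressed in terms of the estimates $\hat{x}^{o}(t)$, $\hat{u}^{o}(t)$.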

We conclude this section by observing that the optimal team strategies involve conditional 
expectations with respect to the DMs information structures. These conditional expectations 
can be simplified considerably by allowing signaling between the different DMs. 
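
To make the fixed point reading of (47) more concrete, the following Python sketch considers a deliberately simplified situation, which is an assumption made here for illustration and not a construction from the paper: the state and adjoint estimates are taken as already computed and collected into a known vector `c`, and each decision is scalar, so that (47) reduces to a deterministic linear system $R u + c = 0$. The iteration $u \leftarrow -D^{-1}(N u + c)$, with $D$ the diagonal of $R$ and $N$ its off-diagonal part, is a Jacobi-type fixed point map; it converges, for instance, when $R$ is strictly diagonally dominant. The function name and the numerical values are hypothetical.

```python
def solve_fixed_point(R, c, tol=1e-12, max_iter=10000):
    """Jacobi-type fixed-point iteration for the linear system R u + c = 0.

    R is a square matrix (list of rows) with nonzero diagonal; c is a list.
    Iterates u <- -D^{-1} (N u + c), with D = diag(R) and N = R - D.
    """
    n = len(c)
    u = [0.0] * n
    for _ in range(max_iter):
        u_next = [
            -(c[i] + sum(R[i][j] * u[j] for j in range(n) if j != i)) / R[i][i]
            for i in range(n)
        ]
        # Stop when successive iterates agree to within the tolerance.
        if max(abs(a - b) for a, b in zip(u_next, u)) < tol:
            return u_next
        u = u_next
    raise RuntimeError("fixed-point iteration did not converge")


# Illustrative two-DM data (assumed values): the off-diagonal entries of R
# play the role of the pay-off coupling R_12, R_21 between the decisions.
R = [[2.0, 0.5], [0.5, 1.0]]
c = [1.0, -1.0]
u = solve_fixed_point(R, c)
residual = [sum(R[i][j] * u[j] for j in range(2)) + c[i] for i in range(2)]
```

In the genuinely decentralized problem the unknowns are conditional expectations with respect to different information structures, so this sketch only illustrates the algebraic structure of the iteration, not the filtering step that produces `c`.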

A. Team Games of Normal Form: Explicit Expressions of Adjoint Processes 

In this section we concentrate on Normal Form (and Linear-Quadratic Form) games, and we
derive explicit expressions for the adjoint processes $(\psi^o(\cdot), Q^o(\cdot))$ as functionals of $x^o(\cdot), u^o(\cdot)$.
Note that this is a necessary step before one proceeds with the computation of the explicit form
of the optimal decentralized strategies, or with their computation via fixed point methods as
in (47).

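For a game of Normal Form the Hamiltonian system of equations is the following.

\[
\mathcal{H}(t,x,\psi,Q,u) = \langle A(t)x + b(t) + B(t)u, \psi \rangle + \mathrm{tr}\big( Q^* \sigma(t,x,u) \big)
+ \tfrac{1}{2} \langle x, H(t)x \rangle + \tfrac{1}{2} \langle u, R(t)u \rangle + \langle x, F(t) \rangle + \langle u, E(t)x \rangle + \langle u, m(t) \rangle, \tag{48}
\]

where $\sigma$ is the diffusion coefficient of the Normal Form dynamics (see (50)). The derivative of the Hamiltonian with respect to $u$ is

\[
\mathcal{H}_u(t,x,\psi,Q,u) = B^*(t)\psi + R(t)u + E(t)x + m(t) + \sum_{i=1}^{m} s_i^*(t)\, Q^{(i)}(t). \tag{49}
\]

Let $(x^o(\cdot), \psi^o(\cdot), Q^o(\cdot))$ denote the solutions of the Hamiltonian system, corresponding to the
optimal control $u^o$, then

\[
dx^o(t) = A(t)x^o(t)\,dt + b(t)\,dt + B(t)u^o_t\,dt + \sum_{i=1}^{m} \big( \kappa_i(t)x^o(t) + s_i(t)u^o_t \big)\,dW_i(t) + G(t)\,dW(t), \quad x^o(0) = x_0, \tag{50}
\]
\[
d\psi^o(t) = -A^*(t)\psi^o(t)\,dt - H(t)x^o(t)\,dt - F(t)\,dt - E^*(t)u^o_t\,dt - V_{Q^o}(t)\,dt + Q^o(t)\,dW(t), \quad \psi^o(T) = M(T)x^o(T) + N(T), \tag{51}
\]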

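\[
V_{Q^o}(t) = \sum_{i=1}^{m} \kappa_i^*(t)\, Q^{(i),o}(t). \tag{52}
\]

Next, we find the form of the solution of the adjoint equation (51) (and also identify the
martingale term in (51) via an alternative method to Remark 2). Let $\{\Phi(t,s) : 0 \leq s \leq t \leq T\}$
denote the transition operator of $A(\cdot)$ and $\Phi^*(\cdot,\cdot)$ that of the adjoint $A^*(\cdot)$ of $A(\cdot)$. Then we
have the identity $\frac{\partial}{\partial s}\Phi^*(t,s) = -A^*(s)\Phi^*(t,s)$, $0 \leq s \leq t \leq T$. One can verify by differentiation
that the solution $\{\psi^o(t) : t \in [0,T]\}$ of (51) is given by

\[
\psi^o(t) = \Phi^*(T,t)\big\{ M(T)x^o(T) + N(T) \big\}
+ \int_t^T \Phi^*(s,t) \Big\{ H(s)x^o(s)\,ds + F(s)\,ds + E^*(s)u^o_s\,ds + V_{Q^o}(s)\,ds - Q^o(s)\,dW(s) \Big\}. \tag{53}
\]

Since for any control policy, $\{x^o(s) : 0 \leq t \leq s \leq T\}$ is uniquely determined from (50) and its
current value $x^o(t)$, then (53) can be expressed via

\[
\psi^o(t) = \Sigma(t)x^o(t) + \beta^o(t), \quad t \in [0,T], \tag{54}
\]

where $\Sigma(\cdot), \beta^o(\cdot)$ determine the operators to the one expressed via (53).

Next, we determine the operators $(\Sigma(\cdot), \beta^o(\cdot))$. Differentiating both sides of (54) and using (50),
(51) yields

\[
\begin{aligned}
& -A^*(t)\psi^o(t)\,dt - H(t)x^o(t)\,dt - F(t)\,dt - E^*(t)u^o_t\,dt - V_{Q^o}(t)\,dt + Q^o(t)\,dW(t) \\
&\quad = \dot{\Sigma}(t)x^o(t)\,dt + \Sigma(t)\Big\{ A(t)x^o(t)\,dt + b(t)\,dt + B(t)u^o_t\,dt
+ \sum_{i=1}^{m} \big( \kappa_i(t)x^o(t) + s_i(t)u^o_t + G^{(i)}(t) \big)\,dW_i(t) \Big\} + d\beta^o(t).
\end{aligned} \tag{55}
\]

By matching the intensity of the martingale terms $\{\cdot\}\,dW_i(t)$ in (55) we obtain

\[
Q^{(i),o}(t) = \psi^o_x(t)\, \sigma^{(i)}(t,x^o(t),u^o_t) = \Sigma(t)\big( \kappa_i(t)x^o(t) + s_i(t)u^o_t + G^{(i)}(t) \big), \quad t \in [0,T], \quad i = 1, \ldots, m, \tag{56}
\]

and by (52) we also obtain

\[
V_{Q^o}(t) = \sum_{i=1}^{m} \kappa_i^*(t)\, \Sigma(t) \big( \kappa_i(t)x^o(t) + s_i(t)u^o_t + G^{(i)}(t) \big), \quad t \in [0,T]. \tag{57}
\]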



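Clearly, $Q^o$ given by (56) is precisely the one predicted by Remark 2.
Substituting the claimed relation (54) into (55) we obtain the identity

\[
\begin{aligned}
& \Big\{ -A^*(t)\Sigma(t) - \Sigma(t)A(t) - H(t) - \sum_{i=1}^{m} \kappa_i^*(t)\Sigma(t)\kappa_i(t) - \dot{\Sigma}(t) \Big\} x^o(t)\,dt + \sum_{i=1}^{m} Q^{(i),o}(t)\,dW_i(t) \\
&\quad = A^*(t)\beta^o(t)\,dt + \Sigma(t)b(t)\,dt + \Sigma(t)B(t)u^o_t\,dt + \sum_{i=1}^{m} \kappa_i^*(t)\Sigma(t)\big( s_i(t)u^o_t + G^{(i)}(t) \big)\,dt \\
&\qquad + \Sigma(t) \sum_{i=1}^{m} \big( \kappa_i(t)x^o(t) + s_i(t)u^o_t + G^{(i)}(t) \big)\,dW_i(t) + F(t)\,dt + E^*(t)u^o_t\,dt + d\beta^o(t).
\end{aligned} \tag{58}
\]

Therefore, from (56) we deduce

\[
\dot{\Sigma}(t) + A^*(t)\Sigma(t) + \Sigma(t)A(t) + \sum_{i=1}^{m} \kappa_i^*(t)\Sigma(t)\kappa_i(t) + H(t) = 0, \quad \Sigma(T) = M(T), \tag{59}
\]
\[
\dot{\beta}^o(t) + A^*(t)\beta^o(t) + \Sigma(t)b(t) + F(t) + \Sigma(t)B(t)u^o_t + E^*(t)u^o_t
+ \sum_{i=1}^{m} \kappa_i^*(t)\Sigma(t)\big( s_i(t)u^o_t + G^{(i)}(t) \big) = 0, \quad \beta^o(T) = N(T). \tag{60}
\]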

The closed form expressions of the adjoint processes $(\psi^o(\cdot), Q^o(\cdot))$ of this section are required
in order to explicitly compute the closed form expression of the optimal decentralized strategies
or to apply fixed point methods via (47) (in addition to solving centralized problems).
Next we find the optimal strategy assuming a centralized information structure for each DM, and
then we determine the optimal strategies assuming decentralized information structures for each
DM. The reason we pursue centralized strategies is to gain additional insight into how they differ
from decentralized strategies, both in the procedure and in the amount of complexity
involved in implementing centralized versus decentralized strategies.

B. Centralized Information Structure: NF and LQF 

First, we consider a centralized information structure and we compute the optimal strategy
for team games of Normal and Linear-Quadratic forms. For any $t \in [0,T]$ the information
structure $\mathcal{G}^{x^u}_{0,t} = \mathcal{G}^{x^{1,u}}_{0,t} \vee \mathcal{G}^{x^{2,u}}_{0,t} \vee \ldots \vee \mathcal{G}^{x^{N,u}}_{0,t}$ is available to all DMs and it is the $\sigma$-algebra $\mathcal{G}^{x^u}_{0,t} =
\sigma\{(x^1(s), x^2(s), \ldots, x^N(s)) : 0 \leq s \leq t\}$ (we assume a strong formulation so the information
depends on $u$). If instead we consider the nonanticipative centralized information structure
$\mathcal{G}^{W}_{0,t} = \mathcal{G}^{W^1}_{0,t} \vee \mathcal{G}^{W^2}_{0,t} \vee \ldots \vee \mathcal{G}^{W^N}_{0,t}$, then the final results are the same. This is a
common (centralized) full information structure decision strategy, hence the optimal decision
$\{u^o_t : 0 \leq t \leq T\}$ is found via

\[
\mathbb{E}\big\{ \mathcal{H}_u(t, x^o(t), \psi^o(t), Q^o(t), u^o_t) \,\big|\, \mathcal{G}^{x^{u^o}}_{0,t} \big\} = 0, \quad \text{a.e. } t \in [0,T], \quad \mathbb{P}|_{\mathcal{G}^{x^{u^o}}_{0,t}}\text{-a.s.}, \tag{61}
\]

where $(x^o(\cdot), \psi^o(\cdot), Q^o(\cdot))$ are solutions of the Hamiltonian system (50), (51) corresponding to
$u^o$. Since $(\psi^o(\cdot), Q^o(\cdot))$ are given by (54) and (56), respectively, all we need to do is to determine
$u^o$ as a functional of $x^o$.
We show the following claims.

LQF. When the system dynamics and pay-off are of Linear-Quadratic Form, the optimal
centralized strategy is given by

\[
u^o_t = -R^{-1}(t)B^*(t)K(t)x^o(t), \quad t \in [0,T], \tag{62}
\]

where the operator $K(t) \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^n)$ is the symmetric positive semidefinite solution of the
differential equation

\[
\dot{K}(t) + A^*(t)K(t) + K(t)A(t) - K(t)B(t)R^{-1}(t)B^*(t)K(t) + H(t) = 0, \quad t \in [0,T), \tag{63}
\]
\[
K(T) = M(T). \tag{64}
\]
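
As a numerical illustration of (62)-(64), the following Python sketch integrates the scalar version of the Riccati equation (63) backward from the terminal condition (64) with a classical Runge-Kutta step. The scalar setting, the constant coefficients `a, b, r, h, m`, the function name and the step count are assumptions made here for illustration only; the paper treats the matrix-valued, time-varying case.

```python
import math


def riccati_backward(a, b, r, h, m, T, n_steps=2000):
    """Integrate dK/dt + 2*a*K - (b**2 / r) * K**2 + h = 0, K(T) = m,
    backward in time with RK4 on a uniform grid.

    Returns the list [K(t_0), ..., K(t_n)] with t_k = k * T / n_steps.
    """
    def rhs(K):
        # dK/dt written explicitly from the scalar form of (63).
        return -(2.0 * a * K - (b * b / r) * K * K + h)

    dt = T / n_steps
    K = [0.0] * (n_steps + 1)
    K[n_steps] = m
    for k in range(n_steps, 0, -1):
        Kk = K[k]
        # RK4 with a negative step -dt (marching from t_k to t_{k-1}).
        k1 = rhs(Kk)
        k2 = rhs(Kk - 0.5 * dt * k1)
        k3 = rhs(Kk - 0.5 * dt * k2)
        k4 = rhs(Kk - dt * k3)
        K[k - 1] = Kk - dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return K


# Assumed data: a = 0, b = r = h = 1, m = 0, T = 2. In this case the exact
# solution of the scalar Riccati equation is K(t) = tanh(T - t).
K_path = riccati_backward(a=0.0, b=1.0, r=1.0, h=1.0, m=0.0, T=2.0)
# Feedback gain of the scalar analogue of (62): u_t = gain(t) * x(t).
gain_path = [-(1.0 / 1.0) * Kt for Kt in K_path]
```

For matrix-valued coefficients the same backward march applies with matrix products in place of scalar ones; a library ODE solver could be substituted for the hand-written RK4 step.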

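NF with $E = 0$. When the system dynamics and pay-off are of Normal Form (with $E = 0$), the
optimal centralized strategy is given by

\[
u^o_t = -\Big( R(t) + \sum_{i=1}^{m} s_i^*(t)K(t)s_i(t) \Big)^{-1}
\Big\{ \Big( B^*(t)K(t) + \sum_{i=1}^{m} s_i^*(t)K(t)\kappa_i(t) \Big) x^o(t) + m(t) + B^*(t)r(t) \Big\}, \quad t \in [0,T], \tag{65}
\]

where the operator $K(t) \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^n)$ is symmetric positive semidefinite, and $r(t) \in \mathbb{R}^n$, and
they are solutions of the differential equations

\[
\begin{aligned}
& \dot{K}(t) + A^*(t)K(t) + K(t)A(t) + \sum_{i=1}^{m} \kappa_i^*(t)K(t)\kappa_i(t) + H(t) \\
&\quad - \Big( K(t)B(t) + \sum_{i=1}^{m} \kappa_i^*(t)K(t)s_i(t) \Big) \Big( R(t) + \sum_{i=1}^{m} s_i^*(t)K(t)s_i(t) \Big)^{-1}
\Big( B^*(t)K(t) + \sum_{i=1}^{m} s_i^*(t)K(t)\kappa_i(t) \Big) = 0, \quad t \in [0,T),
\end{aligned} \tag{66}
\]
\[
K(T) = M(T), \tag{67}
\]
\[
\begin{aligned}
& \dot{r}(t) + \Big\{ A^*(t) - \Big( K(t)B(t) + \sum_{i=1}^{m} \kappa_i^*(t)K(t)s_i(t) \Big) \Big( R(t) + \sum_{i=1}^{m} s_i^*(t)K(t)s_i(t) \Big)^{-1} B^*(t) \Big\} r(t) + F(t) + K(t)b(t) \\
&\quad - \Big( K(t)B(t) + \sum_{i=1}^{m} \kappa_i^*(t)K(t)s_i(t) \Big) \Big( R(t) + \sum_{i=1}^{m} s_i^*(t)K(t)s_i(t) \Big)^{-1} m(t) = 0, \quad t \in [0,T),
\end{aligned} \tag{68}
\]
\[
r(T) = N(T). \tag{69}
\]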

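Next, we verify the claim stated under LQF and we leave the claim stated under NF to the
reader since its derivation is similar.

Derivation of LQF Solution. From (61) the optimal strategy is

\[
u^o_t = -R^{-1}(t)B^*(t)\, \mathbb{E}\big\{ \psi^o(t) \,\big|\, \mathcal{G}^{x^{u^o}}_{0,t} \big\}, \quad t \in [0,T], \tag{70}
\]

where $(x^o(\cdot), \psi^o(\cdot), Q^o(\cdot))$ denote the solutions of the following Hamiltonian system, corresponding
to the optimal control $u^o$

\[
dx^o(t) = A(t)x^o(t)\,dt + B(t)u^o_t\,dt + G(t)\,dW(t), \quad x^o(0) = x_0, \tag{71}
\]
\[
d\psi^o(t) = -A^*(t)\psi^o(t)\,dt - H(t)x^o(t)\,dt - V_{Q^o}(t)\,dt + Q^o(t)\,dW(t), \quad \psi^o(T) = M(T)x^o(T), \tag{72}
\]
\[
V_{Q^o}(t) = 0, \quad Q^o(t) = \Sigma(t)G(t). \tag{73}
\]

Then $\{\psi^o(t) : t \in [0,T]\}$ is given by (53) with $F = 0$, $E = 0$, hence

\[
\psi^o(t) = \Phi^*(T,t)M(T)x^o(T) + \int_t^T \Phi^*(s,t)\big\{ H(s)x^o(s)\,ds - Q^o(s)\,dW(s) \big\}. \tag{74}
\]

For any admissible decision $u$ and corresponding $(x(\cdot), \psi(\cdot))$ define their filtered versions by

\[
\hat{x}(t) = \mathbb{E}\big\{ x(t) \,\big|\, \mathcal{G}^{x^u}_{0,t} \big\} = x(t), \quad
\hat{\psi}(t) = \mathbb{E}\big\{ \psi(t) \,\big|\, \mathcal{G}^{x^u}_{0,t} \big\}, \quad
\hat{u}_t = \mathbb{E}\big\{ u_t \,\big|\, \mathcal{G}^{x^u}_{0,t} \big\} = u_t, \quad t \in [0,T],
\]

and their predicted versions by

\[
\hat{x}(s,t) = \mathbb{E}\big\{ x(s) \,\big|\, \mathcal{G}^{x^u}_{0,t} \big\}, \quad
\hat{\psi}(s,t) = \mathbb{E}\big\{ \psi(s) \,\big|\, \mathcal{G}^{x^u}_{0,t} \big\}, \quad
\hat{u}(s,t) = \mathbb{E}\big\{ u_s \,\big|\, \mathcal{G}^{x^u}_{0,t} \big\}, \quad 0 \leq t \leq s \leq T.
\]

From (70) the optimal strategy is

\[
u^o_t = -R^{-1}(t)B^*(t)\hat{\psi}^o(t), \quad t \in [0,T]. \tag{75}
\]

Taking conditional expectations on both sides of (74) with respect to $\mathcal{G}^{x^{u^o}}_{0,t}$ yields

\[
\hat{\psi}^o(t) = \Phi^*(T,t)M(T)\hat{x}^o(T,t) + \int_t^T \Phi^*(s,t)H(s)\hat{x}^o(s,t)\,ds, \quad t \in [0,T], \tag{76}
\]

where we utilized the fact that

\[
\mathbb{E}\Big\{ \int_t^T \Phi^*(s,t)Q^o(s)\,dW(s) \,\Big|\, \mathcal{G}^{x^{u^o}}_{0,t} \Big\}
= \mathbb{E}\Big\{ \mathbb{E}\Big\{ \int_t^T \Phi^*(s,t)Q^o(s)\,dW(s) \,\Big|\, \mathbb{F}_{0,t} \Big\} \,\Big|\, \mathcal{G}^{x^{u^o}}_{0,t} \Big\} = 0.
\]

The predictor version of $x^o(\cdot)$ is obtained from (71) utilizing the fact that the last right hand
side term of this equation is a stochastic integral with respect to Brownian motion, hence

\[
\frac{d}{ds}\hat{x}^o(s,t) = A(s)\hat{x}^o(s,t) + B(s)\hat{u}^o(s,t), \quad t < s \leq T, \tag{77}
\]
\[
\hat{x}^o(t,t) = \hat{x}^o(t) = x^o(t), \quad t \in [0,T). \tag{78}
\]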

Since for any policy, and hence for the optimal $u^o$, $\{\hat{x}^o(s,t) : 0 \leq t \leq s \leq T\}$ is uniquely
determined from (77) and the current value $\hat{x}^o(t,t) = x^o(t)$ via (78), then (76) can be expressed
via

\[
\hat{\psi}^o(t) = K(t)\hat{x}^o(t) = K(t)x^o(t), \quad t \in [0,T], \tag{79}
\]

where $K(\cdot)$ determines the operator to the one expressed via (76). Substituting (79) into (75)
we obtain (62). Let $\{\Psi_K(t,s) : 0 \leq s \leq t \leq T\}$ denote the transition operator of $A_K(t) =
\big( A(t) - B(t)R^{-1}(t)B^*(t)K(t) \big)$ and recall the identities $\frac{\partial}{\partial t}\Psi_K(t,s) = A_K(t)\Psi_K(t,s)$, $0 \leq
s \leq t \leq T$, and $\frac{\partial}{\partial t}\Psi_K(s,t) = -\Psi_K(s,t)A_K(t)$, $0 \leq t \leq s \leq T$.

Next, we determine $K(\cdot)$. Substituting the solution of (77), (78), specifically $\hat{x}^o(s,t) = \Psi_K(s,t)x^o(t)$, $0 \leq
t \leq s \leq T$, into (76) we have

\[
\hat{\psi}^o(t) = \Big\{ \Phi^*(T,t)M(T)\Psi_K(T,t) + \int_t^T \Phi^*(s,t)H(s)\Psi_K(s,t)\,ds \Big\} x^o(t), \quad t \in [0,T], \tag{80}
\]

and thus $K(\cdot)$ is identified by the operator

\[
K(t) = \Phi^*(T,t)M(T)\Psi_K(T,t) + \int_t^T \Phi^*(s,t)H(s)\Psi_K(s,t)\,ds, \quad t \in [0,T]. \tag{81}
\]

Differentiating both sides of (81) yields the following differential equation for $K(\cdot)$.

\[
\begin{aligned}
\dot{K}(t) &= \frac{\partial}{\partial t}\Phi^*(T,t)\,M(T)\Psi_K(T,t) + \Phi^*(T,t)M(T)\frac{\partial}{\partial t}\Psi_K(T,t) - H(t) \\
&\quad + \int_t^T \frac{\partial}{\partial t}\Phi^*(s,t)\,H(s)\Psi_K(s,t)\,ds + \int_t^T \Phi^*(s,t)H(s)\frac{\partial}{\partial t}\Psi_K(s,t)\,ds \\
&= -A^*(t)\Phi^*(T,t)M(T)\Psi_K(T,t) - \Phi^*(T,t)M(T)\Psi_K(T,t)A_K(t) - H(t) \\
&\quad - \int_t^T A^*(t)\Phi^*(s,t)H(s)\Psi_K(s,t)\,ds - \int_t^T \Phi^*(s,t)H(s)\Psi_K(s,t)A_K(t)\,ds.
\end{aligned} \tag{82}
\]

Using (81) in the previous equations we obtain the matrix differential equation (63), (64).
An alternative approach is to utilize (54), (59), (60) (with $\kappa_i, b, F, E, s_i = 0$), which implies

\[
\psi^o(t) = \Sigma(t)x^o(t) + \int_t^T \Phi^*(s,t)\Sigma(s)B(s)u^o_s\,ds. \tag{83}
\]

Then replace $u^o(\cdot)$ in (83) by (75) and take conditional expectation to obtain

\[
\hat{\psi}^o(t) = \Sigma(t)x^o(t) - \int_t^T \Phi^*(s,t)\Sigma(s)B(s)R^{-1}(s)B^*(s)\, \mathbb{E}\big\{ \hat{\psi}^o(s) \,\big|\, \mathcal{G}^{x^{u^o}}_{0,t} \big\}\,ds. \tag{84}
\]

Next, assume $\hat{\psi}^o(t) = K(t)x^o(t)$, for some $K(\cdot)$, and then substitute this in (84) to obtain

\[
K(t) = \Sigma(t) - \int_t^T \Phi^*(s,t)\Sigma(s)B(s)R^{-1}(s)B^*(s)K(s)\Psi_K(s,t)\,ds. \tag{85}
\]

By utilizing the equation for $\Sigma(\cdot)$ it can be shown that (85) is a solution of (63), (64).
The previous calculations demonstrate how to compute the optimal strategy when both decision
variables are based on centralized information structures, and it is precisely the optimal strategy
obtained via a variety of other methods in the literature.
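
As a sanity check of the representation (81) against the Riccati equation (63), (64), one can work out a scalar special case by hand. The choice $A = 0$, $B = R = H = 1$, $M(T) = 0$ below is an assumption made here purely for illustration and does not appear in the paper.

```latex
% Scalar illustration (assumed data): A = 0, B = R = H = 1, M(T) = 0.
% Riccati equation (63)-(64):
\dot{K}(t) - K^2(t) + 1 = 0, \quad K(T) = 0
\quad \Longrightarrow \quad K(t) = \tanh(T - t).
% Transition operators: \Phi^*(s,t) = 1 since A = 0, and
% A_K(t) = -K(t) = -\tanh(T - t), so that
\Psi_K(s,t) = \exp\Big( -\int_t^s \tanh(T - \tau)\, d\tau \Big)
            = \frac{\cosh(T - s)}{\cosh(T - t)}, \quad t \le s \le T.
% Substituting into (81) with M(T) = 0:
\int_t^T \Phi^*(s,t)\, H\, \Psi_K(s,t)\, ds
  = \int_t^T \frac{\cosh(T - s)}{\cosh(T - t)}\, ds
  = \frac{\sinh(T - t)}{\cosh(T - t)} = \tanh(T - t) = K(t).
```

In this example the implicit relation (81), in which $K$ appears on both sides through $\Psi_K$, is indeed satisfied by the solution of (63), (64).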

Note that certain computations presented above are also required to compute an expression for
the estimate entering the fixed point equation (47).

Finally, one can verify that the necessary conditions of optimality of Theorem 2 utilized to derive
the above optimal strategy are also sufficient. Specifically, in view of Theorem 2 it suffices to
show convexity of $\varphi(x) = \frac{1}{2}\langle x, M(T)x \rangle + \langle x, N(T) \rangle$ and joint convexity of the Hamiltonian
$\mathcal{H}(t,x,\psi,Q,u)$ in $(x,u)$. Since $M(T) \geq 0$ then $\varphi(x)$ is convex, and since $H(\cdot) \geq 0$, $R(\cdot) > 0$
then $\mathcal{H}(t,x,\psi,Q,u)$ is convex in $(x,u)$.
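
The convexity claim can be made explicit by writing the Hessian of the Hamiltonian (48) with respect to $(x,u)$. The short computation below is a sketch added for illustration; it assumes that $\mathrm{tr}(Q^*\sigma)$ is affine in $(x,u)$, as is the case for the Normal Form diffusion coefficient.

```latex
% Hessian of (48) in (x,u): only the quadratic pay-off terms contribute,
% because all remaining terms are affine in (x,u).
\nabla^2_{(x,u)} \mathcal{H}(t,x,\psi,Q,u) =
\begin{bmatrix} H(t) & E^*(t) \\ E(t) & R(t) \end{bmatrix}.
% LQF case treated above (E = 0):
\begin{bmatrix} H(t) & 0 \\ 0 & R(t) \end{bmatrix} \succeq 0
\quad \text{whenever } H(t) \succeq 0, \ R(t) \succ 0.
% With E \neq 0 and R(t) \succ 0, the Schur complement gives the condition
H(t) - E^*(t) R^{-1}(t) E(t) \succeq 0.
```

Thus for $E = 0$ the stated conditions $H(\cdot) \geq 0$, $R(\cdot) > 0$ suffice, while for $E \neq 0$ joint convexity requires the additional Schur complement condition.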



C. Decentralized Information Structures for LQF 

In this section we invoke the minimum principle to compute the optimal strategies for team
games of Linear-Quadratic Form. We consider decentralized strategies based on 1) nonanticipative
information structures, and 2) feedback information structures. Without loss of generality
we assume the distributed stochastic dynamical decision system consists of an interconnection
of two subsystems, each governed by a linear stochastic differential equation with coupling. The
generalization to an arbitrary number of interconnected subsystems will be given as a corollary.
Consider the distributed dynamics described below.

Subsystem Dynamics 1:

\[
dx^1(t) = A_{11}(t)x^1(t)\,dt + B_{11}(t)u^1_t\,dt + G_{11}(t)\,dW^1(t) + A_{12}(t)x^2(t)\,dt + B_{12}(t)u^2_t\,dt, \quad x^1(0) = x^1_0, \quad t \in (0,T], \tag{86}
\]

Subsystem Dynamics 2:

\[
dx^2(t) = A_{22}(t)x^2(t)\,dt + B_{22}(t)u^2_t\,dt + G_{22}(t)\,dW^2(t) + A_{21}(t)x^1(t)\,dt + B_{21}(t)u^1_t\,dt, \quad x^2(0) = x^2_0, \quad t \in (0,T]. \tag{87}
\]

For any $t \in [0,T]$ the information structure of $u^1_t$ of subsystem 1 is the $\sigma$-algebra $\mathcal{G}^{1}_{0,t}$, and the
information structure of $u^2_t$ of subsystem 2 is the $\sigma$-algebra $\mathcal{G}^{2}_{0,t}$. These information structures
are defined shortly.

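Pay-off Functional:

\[
J(u^1,u^2) = \frac{1}{2}\, \mathbb{E}\Big\{ \int_0^T \Big[ \Big\langle \begin{pmatrix} x^1(t) \\ x^2(t) \end{pmatrix}, H(t) \begin{pmatrix} x^1(t) \\ x^2(t) \end{pmatrix} \Big\rangle
+ \Big\langle \begin{pmatrix} u^1_t \\ u^2_t \end{pmatrix}, R(t) \begin{pmatrix} u^1_t \\ u^2_t \end{pmatrix} \Big\rangle \Big]\,dt
+ \Big\langle \begin{pmatrix} x^1(T) \\ x^2(T) \end{pmatrix}, M(T) \begin{pmatrix} x^1(T) \\ x^2(T) \end{pmatrix} \Big\rangle \Big\}. \tag{88}
\]

We assume that the initial condition $x(0)$, the system Brownian motion $\{W(t) : t \in [0,T]\}$, and
the observations Brownian motions $\{B^1(t) : t \in [0,T]\}$ and $\{B^2(t) : t \in [0,T]\}$ are mutually
independent, and $x(0)$ is Gaussian with $(\mathbb{E}(x(0)), \mathrm{Cov}(x(0))) = (\bar{x}_0, P_0)$.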
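Define the augmented variables by

\[
x \triangleq \begin{bmatrix} x^1 \\ x^2 \end{bmatrix}, \quad
u \triangleq \begin{bmatrix} u^1 \\ u^2 \end{bmatrix}, \quad
\psi \triangleq \begin{bmatrix} \psi^1 \\ \psi^2 \end{bmatrix}, \quad
Q \triangleq \begin{bmatrix} Q^1 \\ Q^2 \end{bmatrix}, \quad
W \triangleq \begin{bmatrix} W^1 \\ W^2 \end{bmatrix},
\]

and matrices by

\[
A \triangleq \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad
B \triangleq \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}, \quad
B^{(1)} \triangleq \begin{bmatrix} B_{11} \\ B_{21} \end{bmatrix}, \quad
B^{(2)} \triangleq \begin{bmatrix} B_{12} \\ B_{22} \end{bmatrix}, \quad
G \triangleq \begin{bmatrix} G_{11} & 0 \\ 0 & G_{22} \end{bmatrix}. \tag{89}
\]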


February 15, 2013 



DRAFT 






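Let $(x^o(\cdot), \psi^o(\cdot), Q^o(\cdot))$ denote the solutions of the Hamiltonian system, corresponding to the
optimal control $u^o$, then

\[
dx^o(t) = A(t)x^o(t)\,dt + B(t)u^o_t\,dt + G(t)\,dW(t), \quad x^o(0) = x_0, \tag{90}
\]
\[
d\psi^o(t) = -A^*(t)\psi^o(t)\,dt - H(t)x^o(t)\,dt - V_{Q^o}(t)\,dt + Q^o(t)\,dW(t), \quad \psi^o(T) = M(T)x^o(T), \tag{91}
\]
\[
V_{Q^o}(t) = 0, \quad Q^o(t) = \Sigma(t)G(t), \quad \psi^o(t) = \Sigma(t)x^o(t) + \beta^o(t), \tag{92}
\]

where $\Sigma(\cdot), \beta^o(\cdot)$ are given by (59), (60) with $s_i, \kappa_i, b, F, E = 0$. The optimal decisions $\{(u^{1,o}_t, u^{2,o}_t) :
0 \leq t \leq T\}$ are obtained from the Hamiltonian derivative (49) with $\sigma(t,x,u) = G(t)$, $b = 0$, $F = 0$, $E = 0$, $m = 0$, and
they are given by

\[
\mathbb{E}\big\{ \mathcal{H}_{u^1}(t, x^{1,o}(t), x^{2,o}(t), \psi^o(t), Q^{1,o}(t), Q^{2,o}(t), u^{1,o}_t, u^{2,o}_t) \,\big|\, \mathcal{G}^{1}_{0,t} \big\} = 0, \quad \text{a.e. } t \in [0,T], \quad \mathbb{P}|_{\mathcal{G}^{1}_{0,t}}\text{-a.s.}, \tag{93}
\]
\[
\mathbb{E}\big\{ \mathcal{H}_{u^2}(t, x^{1,o}(t), x^{2,o}(t), \psi^o(t), Q^{1,o}(t), Q^{2,o}(t), u^{1,o}_t, u^{2,o}_t) \,\big|\, \mathcal{G}^{2}_{0,t} \big\} = 0, \quad \text{a.e. } t \in [0,T], \quad \mathbb{P}|_{\mathcal{G}^{2}_{0,t}}\text{-a.s.} \tag{94}
\]

From (93), (94) the optimal decisions are

\[
u^{1,o}_t = -R_{11}^{-1}(t)B^{(1),*}(t)\, \mathbb{E}\big\{ \psi^o(t) \,\big|\, \mathcal{G}^{1}_{0,t} \big\} - R_{11}^{-1}(t)R_{12}(t)\, \mathbb{E}\big\{ u^{2,o}_t \,\big|\, \mathcal{G}^{1}_{0,t} \big\}, \quad t \in [0,T], \tag{95}
\]
\[
u^{2,o}_t = -R_{22}^{-1}(t)B^{(2),*}(t)\, \mathbb{E}\big\{ \psi^o(t) \,\big|\, \mathcal{G}^{2}_{0,t} \big\} - R_{22}^{-1}(t)R_{21}(t)\, \mathbb{E}\big\{ u^{1,o}_t \,\big|\, \mathcal{G}^{2}_{0,t} \big\}, \quad t \in [0,T]. \tag{96}
\]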

From the previous expressions we notice the following. 

(O6): The optimal strategies (95), (96) illustrate the signaling between $u^1$ and $u^2$, which
is facilitated by the coupling in the pay-off via $R(\cdot)$, and the coupling in the state
dynamics of $x^1$ and $x^2$ via $\psi^o(t) = \Sigma(t)x^o(t) + \beta^o(t)$. Clearly, $u^{1,o}$ estimates the optimal
decision of subsystem 2, $u^{2,o}$, and the adjoint process $\psi^o$ from its observations, and
vice-versa. This coupling is simplified if we consider a simplified model of dynamical
coupling between subsystems $x^1, x^2$ and/or nested information structures, i.e., $\mathcal{G}^{2}_{0,t} \subset
\mathcal{G}^{1}_{0,t}$. Moreover, if we consider no coupling through the pay-off, i.e., a diagonal $R(\cdot)$,
then the second right hand side terms in (95), (96) will be zero, implying that the
signaling between $u^{1,o}, u^{2,o}$ is done via the adjoint process $\psi^o$.

Let $\phi(\cdot)$ be any square integrable and $\mathbb{F}_T$-adapted matrix-valued or scalar-valued
process, and define its filtered and predictor versions by

\[
\pi^i(\phi)(t) = \mathbb{E}\big\{ \phi(t) \,\big|\, \mathcal{G}^{i}_{0,t} \big\}, \quad
\pi^i(\phi)(s,t) = \mathbb{E}\big\{ \phi(s) \,\big|\, \mathcal{G}^{i}_{0,t} \big\}, \quad t \in [0,T], \quad s \geq t, \quad i = 1,2.
\]

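For any admissible decision $u$ and corresponding $(x(\cdot), \psi(\cdot))$ define their filter versions with
respect to $\mathcal{G}^{i}_{0,t}$ for $i = 1, 2$, by

\[
\pi^i(x)(t) \triangleq \begin{bmatrix} \mathbb{E}\{ x^1(t) \,|\, \mathcal{G}^{i}_{0,t} \} \\ \mathbb{E}\{ x^2(t) \,|\, \mathcal{G}^{i}_{0,t} \} \end{bmatrix} \equiv \hat{x}^i(t), \quad
\pi^i(\psi)(t) \triangleq \begin{bmatrix} \mathbb{E}\{ \psi^1(t) \,|\, \mathcal{G}^{i}_{0,t} \} \\ \mathbb{E}\{ \psi^2(t) \,|\, \mathcal{G}^{i}_{0,t} \} \end{bmatrix} \equiv \hat{\psi}^i(t), \quad
\pi^i(u)(t) \triangleq \begin{bmatrix} \mathbb{E}\{ u^1_t \,|\, \mathcal{G}^{i}_{0,t} \} \\ \mathbb{E}\{ u^2_t \,|\, \mathcal{G}^{i}_{0,t} \} \end{bmatrix} \equiv \hat{u}^i(t), \quad t \in [0,T], \quad i = 1,2,
\]

and their predictor versions by

\[
\pi^i(x)(s,t) \triangleq \begin{bmatrix} \mathbb{E}\{ x^1(s) \,|\, \mathcal{G}^{i}_{0,t} \} \\ \mathbb{E}\{ x^2(s) \,|\, \mathcal{G}^{i}_{0,t} \} \end{bmatrix} \equiv \hat{x}^i(s,t), \quad
\pi^i(\psi)(s,t) \triangleq \begin{bmatrix} \mathbb{E}\{ \psi^1(s) \,|\, \mathcal{G}^{i}_{0,t} \} \\ \mathbb{E}\{ \psi^2(s) \,|\, \mathcal{G}^{i}_{0,t} \} \end{bmatrix} \equiv \hat{\psi}^i(s,t), \quad
\pi^i(u)(s,t) \triangleq \begin{bmatrix} \mathbb{E}\{ u^1_s \,|\, \mathcal{G}^{i}_{0,t} \} \\ \mathbb{E}\{ u^2_s \,|\, \mathcal{G}^{i}_{0,t} \} \end{bmatrix} \equiv \hat{u}^i(s,t), \quad t \in [0,T], \quad s \geq t, \quad i = 1,2.
\]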

From (95), (96) the optimal decisions are

\[
u^{1,o}_t = -R_{11}^{-1}(t)B^{(1),*}(t)\, \pi^1(\psi^o)(t) - R_{11}^{-1}(t)R_{12}(t)\, \mathbb{E}\big\{ u^{2,o}_t \,\big|\, \mathcal{G}^{1}_{0,t} \big\}, \quad t \in [0,T], \tag{97}
\]
\[
u^{2,o}_t = -R_{22}^{-1}(t)B^{(2),*}(t)\, \pi^2(\psi^o)(t) - R_{22}^{-1}(t)R_{21}(t)\, \mathbb{E}\big\{ u^{1,o}_t \,\big|\, \mathcal{G}^{2}_{0,t} \big\}, \quad t \in [0,T]. \tag{98}
\]

The previous optimal decisions require the conditional estimates
$\{(\pi^1(\psi^o)(t), \pi^2(\psi^o)(t)) : 0 \leq t \leq T\}$. These are obtained by taking conditional expectations of
(74), giving

\[
\pi^i(\psi^o)(t) = \Phi^*(T,t)M(T)\, \pi^i(x^o)(T,t) + \int_t^T \Phi^*(s,t)H(s)\, \pi^i(x^o)(s,t)\,ds, \quad t \in [0,T], \quad i = 1,2. \tag{99}
\]

Before we proceed further we shall specify the information structures available to the DMs. 

Nonanticipative Information Structures. The information structure available to $u^1$ is $\mathcal{G}^{1}_{0,t} =
\sigma\{W^1(s) : 0 \leq s \leq t\} \equiv \mathcal{G}^{W^1}_{0,t}$, and the information structure available to $u^2$ is $\mathcal{G}^{2}_{0,t} = \sigma\{W^2(s) :
0 \leq s \leq t\} \equiv \mathcal{G}^{W^2}_{0,t}$. Therefore, by denoting $\pi^{w^i}(\cdot)(\cdot)$ the conditional expectation with respect to
$\mathcal{G}^{W^i}_{0,t}$, $i = 1,2$, for any admissible decision, the filtered versions of $x(\cdot)$ based on these information
structures are given by the following stochastic differential equations [30] (Theorem 8.2).

\[
d\pi^{w^1}(x)(t) = A(t)\pi^{w^1}(x)(t)\,dt + B^{(1)}(t)u^1_t\,dt + B^{(2)}(t)\pi^{w^1}(u^2)(t)\,dt + G^{(1)}(t)\,dW^1(t), \quad \pi^{w^1}(x)(0) = \bar{x}_0, \tag{100}
\]
\[
d\pi^{w^2}(x)(t) = A(t)\pi^{w^2}(x)(t)\,dt + B^{(2)}(t)u^2_t\,dt + B^{(1)}(t)\pi^{w^2}(u^1)(t)\,dt + G^{(2)}(t)\,dW^2(t), \quad \pi^{w^2}(x)(0) = \bar{x}_0, \tag{101}
\]

where $G^{(i)}$ denotes the $i$th block column of $G$ in (89).

From the previous filtered versions of $x(\cdot)$ it is clear that subsystem 1 estimates the augmented
state vector and the actions of subsystem 2 based on its own observations, namely, $\pi^{w^1}(u^2)(\cdot)$,
and subsystem 2 estimates the augmented state vector and the actions of subsystem 1 based on
its own observations, namely, $\pi^{w^2}(u^1)(\cdot)$.

For any admissible decision $u$ the predicted versions of $x(\cdot)$ are obtained from (100) and (101)
as follows. Utilizing the identity $\pi^{w^i}(x)(s,t) = \mathbb{E}\big\{ \mathbb{E}\{ x(s) \,|\, \mathcal{G}^{W^i}_{0,s} \} \,\big|\, \mathcal{G}^{W^i}_{0,t} \big\} = \mathbb{E}\big\{ \pi^{w^i}(x)(s) \,\big|\, \mathcal{G}^{W^i}_{0,t} \big\}$,
for $0 \leq t \leq s \leq T$, then

\[
\frac{d}{ds}\pi^{w^1}(x)(s,t) = A(s)\pi^{w^1}(x)(s,t) + B^{(1)}(s)\pi^{w^1}(u^1)(s,t) + B^{(2)}(s)\pi^{w^1}(u^2)(s,t), \quad t < s \leq T, \tag{102}
\]
\[
\pi^{w^1}(x)(t,t) = \pi^{w^1}(x)(t), \quad t \in [0,T), \tag{103}
\]
\[
\frac{d}{ds}\pi^{w^2}(x)(s,t) = A(s)\pi^{w^2}(x)(s,t) + B^{(1)}(s)\pi^{w^2}(u^1)(s,t) + B^{(2)}(s)\pi^{w^2}(u^2)(s,t), \quad t < s \leq T, \tag{104}
\]
\[
\pi^{w^2}(x)(t,t) = \pi^{w^2}(x)(t), \quad t \in [0,T). \tag{105}
\]

Since for a given admissible policy and observation paths, $\{\pi^{w^1}(x)(s,t) : 0 \leq t \leq s \leq T\}$ is
determined from (102) and its current value $\pi^{w^1}(x)(t,t) = \pi^{w^1}(x)(t)$, and $\{\pi^{w^2}(x)(s,t) : 0 \leq
t \leq s \leq T\}$ is determined from (104) and its current value $\pi^{w^2}(x)(t,t) = \pi^{w^2}(x)(t)$, then (99)
can be expressed via

\[
\pi^{w^i}(\psi^o)(t) = K^i(t)\pi^{w^i}(x^o)(t) + r^i(t), \quad t \in [0,T], \quad i = 1,2, \tag{106}
\]

where $K^i(\cdot), r^i(\cdot)$ determine the operators to the one expressed via (99), for $i = 1,2$. Utilizing
(106) in (97) and (98), then

\[
u^{1,o}_t = -R_{11}^{-1}(t)B^{(1),*}(t)\big\{ K^1(t)\pi^{w^1}(x^o)(t) + r^1(t) \big\} - R_{11}^{-1}(t)R_{12}(t)\pi^{w^1}(u^{2,o})(t), \quad t \in [0,T], \tag{107}
\]
\[
u^{2,o}_t = -R_{22}^{-1}(t)B^{(2),*}(t)\big\{ K^2(t)\pi^{w^2}(x^o)(t) + r^2(t) \big\} - R_{22}^{-1}(t)R_{21}(t)\pi^{w^2}(u^{1,o})(t), \quad t \in [0,T]. \tag{108}
\]

Let $\{\Psi_{K^i}(t,s) : 0 \leq s \leq t \leq T\}$ denote the transition operator of $A_{K^i}(t) = \big( A(t) -
B^{(i)}(t)R_{ii}^{-1}(t)B^{(i),*}(t)K^i(t) \big)$, for $i = 1,2$.

Next, we determine $K^i(\cdot), r^i(\cdot)$, $i = 1,2$. Substituting the previous equations into (102), (103)
and (104), (105), then for $i, j \in \{1,2\}$, $j \neq i$,

\[
\begin{aligned}
\pi^{w^i}(x^o)(s,t) &= \Psi_{K^i}(s,t)\pi^{w^i}(x^o)(t) - \int_t^s \Psi_{K^i}(s,\tau)B^{(i)}(\tau)R_{ii}^{-1}(\tau)B^{(i),*}(\tau)r^i(\tau)\,d\tau \\
&\quad - \int_t^s \Psi_{K^i}(s,\tau)B^{(i)}(\tau)R_{ii}^{-1}(\tau)R_{ij}(\tau)\pi^{w^i}(u^{j,o})(\tau,t)\,d\tau \\
&\quad + \int_t^s \Psi_{K^i}(s,\tau)B^{(j)}(\tau)\pi^{w^i}(u^{j,o})(\tau,t)\,d\tau, \quad t \leq s \leq T.
\end{aligned} \tag{109, 110}
\]

Since $u^{1,o}_t$ is $\mathcal{G}^{W^1}_{0,t}$-measurable and $u^{2,o}_t$ is $\mathcal{G}^{W^2}_{0,t}$-measurable, and $W^1(\cdot)$ and $W^2(\cdot)$ are independent,
then $\pi^{w^1}(u^{2,o})(\tau,t) = \mathbb{E}(u^{2,o}_\tau) \equiv \bar{u}^{2,o}(\tau)$, $\pi^{w^2}(u^{1,o})(\tau,t) = \mathbb{E}(u^{1,o}_\tau) \equiv \bar{u}^{1,o}(\tau)$, $0 \leq t \leq \tau \leq T$.
Utilizing the last observation we show in the next main theorem that the optimal DM strategies
are finite dimensional (i.e., given in terms of a finite number of statistics), and that each optimal
strategy is a linear function of the augmented state estimate based on its own information, and the
average value of the other optimal strategy. The computation of the average optimal strategies
can be expressed in fixed point form.

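Theorem 3. (Optimal decentralized strategies for LQF)
Given a LQF game the optimal decisions $(u^{1,o}, u^{2,o})$ are given by

\[
u^{1,o}_t = -R_{11}^{-1}(t)B^{(1),*}(t)\big\{ K^1(t)\pi^{w^1}(x^o)(t) + r^1(t) \big\} - R_{11}^{-1}(t)R_{12}(t)\bar{u}^{2,o}(t), \quad t \in [0,T], \tag{111}
\]
\[
u^{2,o}_t = -R_{22}^{-1}(t)B^{(2),*}(t)\big\{ K^2(t)\pi^{w^2}(x^o)(t) + r^2(t) \big\} - R_{22}^{-1}(t)R_{21}(t)\bar{u}^{1,o}(t), \quad t \in [0,T], \tag{112}
\]

where $\pi^{w^i}(x^o)(\cdot)$, $i = 1,2$ satisfy the linear non-homogeneous stochastic differential equations

\[
d\pi^{w^1}(x^o)(t) = A(t)\pi^{w^1}(x^o)(t)\,dt + B^{(1)}(t)u^{1,o}_t\,dt + B^{(2)}(t)\bar{u}^{2,o}(t)\,dt + G^{(1)}(t)\,dW^1(t), \quad \pi^{w^1}(x^o)(0) = \bar{x}_0, \tag{113}
\]
\[
d\pi^{w^2}(x^o)(t) = A(t)\pi^{w^2}(x^o)(t)\,dt + B^{(2)}(t)u^{2,o}_t\,dt + B^{(1)}(t)\bar{u}^{1,o}(t)\,dt + G^{(2)}(t)\,dW^2(t), \quad \pi^{w^2}(x^o)(0) = \bar{x}_0, \tag{114}
\]

with $G^{(i)}$ the $i$th block column of $G$, and $\big( K^i(\cdot), r^i(\cdot), \bar{x}^o(\cdot), \bar{u}^{i,o}(\cdot) \big)$, $i = 1,2$ are solutions of the ordinary differential equations (115),
(116), (117), (118), (119), (120) below.

\[
\dot{K}^i(t) + A^*(t)K^i(t) + K^i(t)A(t) - K^i(t)B^{(i)}(t)R_{ii}^{-1}(t)B^{(i),*}(t)K^i(t) + H(t) = 0, \quad t \in [0,T), \quad i = 1,2, \tag{115}
\]
\[
K^i(T) = M(T), \quad i = 1,2, \tag{116}
\]

and, for $i, j \in \{1,2\}$, $j \neq i$,

\[
\begin{aligned}
\dot{r}^i(t) &= \Big\{ -A^*(t) + \Phi^*(T,t)M(T)\Psi_{K^i}(T,t)B^{(i)}(t)R_{ii}^{-1}(t)B^{(i),*}(t)
+ \Big( \int_t^T \Phi^*(s,t)H(s)\Psi_{K^i}(s,t)\,ds \Big) B^{(i)}(t)R_{ii}^{-1}(t)B^{(i),*}(t) \Big\} r^i(t) \\
&\quad - \Big( \int_t^T \Phi^*(s,t)H(s)\Psi_{K^i}(s,t)\,ds \Big) \big( B^{(j)}(t) - B^{(i)}(t)R_{ii}^{-1}(t)R_{ij}(t) \big) \bar{u}^{j,o}(t) \\
&\quad - \Phi^*(T,t)M(T)\Psi_{K^i}(T,t) \big( B^{(j)}(t) - B^{(i)}(t)R_{ii}^{-1}(t)R_{ij}(t) \big) \bar{u}^{j,o}(t), \quad t \in [0,T), \quad r^i(T) = 0,
\end{aligned} \tag{117, 118}
\]
\[
\dot{\bar{x}}^o(t) = A(t)\bar{x}^o(t) + B^{(1)}(t)\bar{u}^{1,o}(t) + B^{(2)}(t)\bar{u}^{2,o}(t), \quad \bar{x}^o(0) = \bar{x}_0, \tag{119}
\]
\[
\begin{bmatrix} \bar{u}^{1,o}(t) \\ \bar{u}^{2,o}(t) \end{bmatrix}
= -\begin{bmatrix} I & R_{11}^{-1}(t)R_{12}(t) \\ R_{22}^{-1}(t)R_{21}(t) & I \end{bmatrix}^{-1}
\begin{bmatrix} R_{11}^{-1}(t)B^{(1),*}(t)\big\{ K^1(t)\bar{x}^o(t) + r^1(t) \big\} \\ R_{22}^{-1}(t)B^{(2),*}(t)\big\{ K^2(t)\bar{x}^o(t) + r^2(t) \big\} \end{bmatrix}. \tag{120}
\]

Proof: Since $u^{1,o}_t$ is $\mathcal{G}^{W^1}_{0,t}$-measurable and $u^{2,o}_t$ is $\mathcal{G}^{W^2}_{0,t}$-measurable, and $W^1(\cdot)$, $W^2(\cdot)$ are
independent, then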



\[
\pi^{w^1}(u^2)(s,t) = \mathbb{E}\big( u^2_s \,\big|\, \mathcal{G}^{W^1}_{0,t} \big) = \mathbb{E}(u^2_s) \equiv \bar{u}^2(s), \quad t \leq s \leq T, \tag{121}
\]
\[
\pi^{w^2}(u^1)(s,t) = \mathbb{E}\big( u^1_s \,\big|\, \mathcal{G}^{W^2}_{0,t} \big) = \mathbb{E}(u^1_s) \equiv \bar{u}^1(s), \quad t \leq s \leq T. \tag{122}
\]

Substituting (121), (122) into (109), (110), and then (109), (110) into (99), we have, for $i, j \in \{1,2\}$, $j \neq i$,

\[
\begin{aligned}
\pi^{w^i}(\psi^o)(t) &= \Big\{ \Phi^*(T,t)M(T)\Psi_{K^i}(T,t) + \int_t^T \Phi^*(s,t)H(s)\Psi_{K^i}(s,t)\,ds \Big\} \pi^{w^i}(x^o)(t) \\
&\quad + \Phi^*(T,t)M(T) \int_t^T \Psi_{K^i}(T,\tau) \big( B^{(j)}(\tau) - B^{(i)}(\tau)R_{ii}^{-1}(\tau)R_{ij}(\tau) \big) \bar{u}^{j,o}(\tau)\,d\tau \\
&\quad + \int_t^T \Phi^*(s,t)H(s) \int_t^s \Psi_{K^i}(s,\tau) \big( B^{(j)}(\tau) - B^{(i)}(\tau)R_{ii}^{-1}(\tau)R_{ij}(\tau) \big) \bar{u}^{j,o}(\tau)\,d\tau\,ds \\
&\quad - \Phi^*(T,t)M(T) \int_t^T \Psi_{K^i}(T,\tau)B^{(i)}(\tau)R_{ii}^{-1}(\tau)B^{(i),*}(\tau)r^i(\tau)\,d\tau \\
&\quad - \int_t^T \Phi^*(s,t)H(s) \int_t^s \Psi_{K^i}(s,\tau)B^{(i)}(\tau)R_{ii}^{-1}(\tau)B^{(i),*}(\tau)r^i(\tau)\,d\tau\,ds.
\end{aligned} \tag{123, 124}
\]

Comparing (106) with the previous two equations, $K^i(\cdot)$, $i = 1,2$ are identified by the
operators

\[
K^i(t) = \Phi^*(T,t)M(T)\Psi_{K^i}(T,t) + \int_t^T \Phi^*(s,t)H(s)\Psi_{K^i}(s,t)\,ds, \quad t \in [0,T], \quad i = 1,2, \tag{125}
\]

and $r^i(\cdot)$, $i = 1,2$ by the processes (with $j \neq i$)

\[
\begin{aligned}
r^i(t) &= \Phi^*(T,t)M(T) \int_t^T \Psi_{K^i}(T,\tau) \big( B^{(j)}(\tau) - B^{(i)}(\tau)R_{ii}^{-1}(\tau)R_{ij}(\tau) \big) \bar{u}^{j,o}(\tau)\,d\tau \\
&\quad + \int_t^T \Phi^*(s,t)H(s) \int_t^s \Psi_{K^i}(s,\tau) \big( B^{(j)}(\tau) - B^{(i)}(\tau)R_{ii}^{-1}(\tau)R_{ij}(\tau) \big) \bar{u}^{j,o}(\tau)\,d\tau\,ds \\
&\quad - \Phi^*(T,t)M(T) \int_t^T \Psi_{K^i}(T,\tau)B^{(i)}(\tau)R_{ii}^{-1}(\tau)B^{(i),*}(\tau)r^i(\tau)\,d\tau \\
&\quad - \int_t^T \Phi^*(s,t)H(s) \int_t^s \Psi_{K^i}(s,\tau)B^{(i)}(\tau)R_{ii}^{-1}(\tau)B^{(i),*}(\tau)r^i(\tau)\,d\tau\,ds.
\end{aligned} \tag{126, 127}
\]

Differentiating both sides of (125), the operators $K^i(\cdot)$, $i = 1,2$ satisfy the matrix
differential equations (115), (116). Differentiating both sides of (126), (127), the processes
$r^i(\cdot)$, $i = 1,2$ satisfy the differential equations (117), (118). Utilizing (121), (122) we obtain
the optimal strategies (111), (112). Next, we determine $\bar{u}^{i,o}$ for $i = 1,2$ from (111), (112).
Define the averages

\[
\bar{x}(t) = \mathbb{E}\{ x(t) \} = \mathbb{E}\big\{ \pi^{w^i}(x)(t) \big\}, \quad i = 1,2. \tag{128}
\]

Then $\bar{x}^o(\cdot)$ satisfies the ordinary differential equation (119). Taking the expectation of both sides
of (111), (112) we deduce the corresponding equations

\[
\bar{u}^{1,o}(t) = -R_{11}^{-1}(t)B^{(1),*}(t)\big\{ K^1(t)\bar{x}^o(t) + r^1(t) \big\} - R_{11}^{-1}(t)R_{12}(t)\bar{u}^{2,o}(t), \quad t \in [0,T], \tag{129}
\]
\[
\bar{u}^{2,o}(t) = -R_{22}^{-1}(t)B^{(2),*}(t)\big\{ K^2(t)\bar{x}^o(t) + r^2(t) \big\} - R_{22}^{-1}(t)R_{21}(t)\bar{u}^{1,o}(t), \quad t \in [0,T]. \tag{130}
\]

The last two equations can be written in matrix form (120). This completes the derivation. ■
Hence, the optimal strategies are computed from (111), (112), where the filter equations for
$\pi^{w^i}(x^o)(\cdot)$, $i = 1,2$ satisfy (113), (114), while $\big( K^i(\cdot), r^i(\cdot), \bar{u}^{i,o}(\cdot), \bar{x}^o(\cdot) \big)$, $i = 1,2$ are computed
off-line utilizing the ordinary differential equations (115), (116), (117), (118), (119), (120). Note
that the optimal decentralized strategy $u^{1,o}$ given by (111) is a linear function of the state estimate
$\pi^{w^1}(x^o)(\cdot)$ and $\mathbb{E}(u^{2,o})(\cdot)$, while the state estimate is governed by (113), in which $u^{2,o}$ is replaced by its
average value $\mathbb{E}(u^{2,o})(\cdot)$, and similarly for $u^{2,o}$. The optimal strategies can be further simplified
by considering special structures of interconnected dynamics, such as coupling of the subsystems
via the DMs, coupling through the pay-off only, diagonal matrices $R = \mathrm{diag}\{R_{11}, R_{22}\}$, $H =
\mathrm{diag}\{H_{11}, H_{22}\}$, $M = \mathrm{diag}\{M_{11}, M_{22}\}$, etc.
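
The algebraic step behind (120) can be sketched numerically. The Python fragment below solves, at a fixed time $t$, the pair of coupled equations (129), (130) for the averaged decisions, assuming scalar decisions and a two-dimensional augmented state. All numerical values, the function name `averaged_strategies` and the helper names are hypothetical, and $K^i(t)$, $r^i(t)$, $\bar{x}^o(t)$ are taken as given rather than computed from (115)-(119).

```python
def averaged_strategies(R, B1, B2, K1, K2, r1, r2, xbar):
    """Solve (129)-(130) for the averaged decisions at one time instant.

    R      : 2x2 list [[R11, R12], [R21, R22]] of scalars (scalar decisions)
    B1, B2 : block columns B^(1), B^(2) of B, as lists of length n
    K1, K2 : n x n lists (values of K^1(t), K^2(t))
    r1, r2 : lists of length n (values of r^1(t), r^2(t))
    xbar   : list of length n (value of the averaged state)
    """
    def matvec(M, v):
        return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    p1 = [k + r for k, r in zip(matvec(K1, xbar), r1)]  # K^1 xbar + r^1
    p2 = [k + r for k, r in zip(matvec(K2, xbar), r2)]  # K^2 xbar + r^2
    g1 = -dot(B1, p1) / R[0][0]   # -R11^{-1} B^(1),* (K^1 xbar + r^1)
    g2 = -dot(B2, p2) / R[1][1]   # -R22^{-1} B^(2),* (K^2 xbar + r^2)
    a12 = R[0][1] / R[0][0]       # R11^{-1} R12
    a21 = R[1][0] / R[1][1]       # R22^{-1} R21
    det = 1.0 - a12 * a21         # determinant of [[1, a12], [a21, 1]]
    if abs(det) < 1e-14:
        raise ValueError("coupling matrix in (120) is singular")
    # Cramer's rule for  u1 + a12*u2 = g1,  a21*u1 + u2 = g2.
    u1 = (g1 - a12 * g2) / det
    u2 = (g2 - a21 * g1) / det
    return u1, u2


# Assumed illustrative data.
R = [[2.0, 0.5], [0.5, 1.0]]
B1, B2 = [1.0, 0.0], [0.0, 1.0]
K1 = [[1.0, 0.2], [0.2, 1.0]]
K2 = [[1.5, 0.0], [0.0, 0.5]]
r1, r2 = [0.1, 0.0], [0.0, -0.2]
xbar = [1.0, -1.0]
u1_bar, u2_bar = averaged_strategies(R, B1, B2, K1, K2, r1, r2, xbar)
```

In an actual computation this solve would be embedded in the integration of (117)-(119), since $r^i$ and $\bar{x}^o$ themselves depend on the averaged decisions; with a diagonal $R$ the coupling matrix reduces to the identity and the two averages decouple algebraically.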

Further, Theorem 3 can be generalized to team games with an arbitrary number of interconnected
subsystems. In addition, one may consider feedback information structures, delayed information
structures, etc.

These generalizations and simplifications are stated in the next remark.
Remark 4. (Generalizations and Simplifications)

Generalizations. Theorem 3 is easily generalized to the following arbitrary coupled dynamics

\[
dx^i(t) = A_{ii}(t)x^i(t)\,dt + B_{ii}(t)u^i_t\,dt + G_{ii}(t)\,dW^i(t)
+ \sum_{j=1, j \neq i}^{N} A_{ij}(t)x^j(t)\,dt + \sum_{j=1, j \neq i}^{N} B_{ij}(t)u^j_t\,dt, \quad x^i(0) = x^i_0, \quad t \in (0,T], \quad i \in \mathbb{Z}_N, \tag{131}
\]

and DMs information structures

\[
u^i_t \text{ is } \mathcal{G}^{W^i}_{0,t}\text{-measurable}, \quad t \in [0,T], \quad i \in \mathbb{Z}_N. \tag{132}
\]

The optimal strategies are obvious extensions of the ones given in Theorem 3.

Simplifications. Several simpler forms can be deduced from the results of Theorem 3 by
assuming any of the following: $R = \mathrm{diag}\{R_{11}, R_{22}\}$, $H = \mathrm{diag}\{H_{11}, H_{22}\}$, $M = \mathrm{diag}\{M_{11}, M_{22}\}$.
Moreover, simplified strategies can be derived by assuming nested information structures, that
is, $u^1_t$ is $\mathcal{G}^{W^1}_{0,t}$-measurable and $u^2_t$ is $\mathcal{G}^{W^1,W^2}_{0,t}$-measurable.

Delay Information Structures. The optimality conditions hold for any $\mathcal{G}^{i}_{0,t}$-measurable DM
strategies $u^i$, $i = 1, \ldots, N$. Therefore, one can apply the necessary conditions to DMs
information structures

\[
u^i_t \text{ is } \mathcal{G}^{W^i}_{0,t-\epsilon_i}\text{-measurable}, \quad \epsilon_i > 0, \quad t \in [0,T], \quad i \in \mathbb{Z}_N, \tag{133}
\]

or any other information structures of interest, such as delayed sharing.

Feedback Information Structures. The previous generalizations/simplifications also apply to
feedback information structures $\mathcal{G}^{z^{i,u}}_{0,t}$. Specifically, to derive the corresponding results of
Theorem 3 even for the simplest scenario $z^1 = x^1$, $z^2 = x^2$, one has to compute conditional
expectations with respect to $\mathcal{G}^{z^{i,u}}_{0,t}$, $i = 1,2$, and hence one has to invoke nonlinear filtering
techniques to determine expressions for the filters $\pi^{z^i}(x)(t) = \mathbb{E}\big\{ x(t) \,\big|\, \mathcal{G}^{z^{i,u}}_{0,t} \big\}$, $i = 1,2$,
$\pi^{z^2}(u^1)(t) = \mathbb{E}\big\{ u^1_t \,\big|\, \mathcal{G}^{z^{2,u}}_{0,t} \big\}$, $\pi^{z^1}(u^2)(t) = \mathbb{E}\big\{ u^2_t \,\big|\, \mathcal{G}^{z^{1,u}}_{0,t} \big\}$ (and predictions of $x(t), u^1_t, u^2_t$). It
appears to us that the optimal team laws are the same as those derived for nonanticipative
information structures, given by (111), (112), with $\pi^{w^i}(x)(t)$ replaced by $\pi^{z^i}(x)(t)$, $i = 1,2$, and
$\bar{u}^{1,o}(t), \bar{u}^{2,o}(t)$ replaced by $\pi^{z^2}(u^1)(t), \pi^{z^1}(u^2)(t)$. These estimates (filters) may not be described
in terms of linear Kalman-type equations driven by the DMs strategies governing the conditional
means, whose gains are specified by the conditional error covariance equations, independently
of the observations. A possible approach to computing these conditional expectations is the
identification of a sufficient statistic as in [37]-[33].

Signaling. Given the optimal decentralized strategies of Theorem 3 we can determine the amount
of signaling among the DMs to reduce the computational complexity of the optimal strategies.

IV. Conclusions and Future Work 

In this second part of our two-part paper, we invoke the stochastic maximum principle,
conditional Hamiltonian and the coupled backward-forward stochastic differential equations
of the first part [1] to derive team optimal decentralized strategies for distributed stochastic
differential systems with noiseless information structures. We present examples of such team
games of nonlinear as well as linear quadratic forms. In some cases we obtain closed form
expressions of the optimal decentralized strategies.

The methodology is very general, and applicable to several types of information structures
such as the ones described under Remark 4. It will be interesting to consider additional types
of information structures and compute the optimal decentralized strategies in closed form, to
better understand the implications of signaling and the computational complexity of such strategies
compared to centralized strategies.